Db 18215 ATATAAATATCTATTAATTTATAATTAGTATATAGTTTTTTTTTAAAAAAAAAATTATTT 18156 

Qy 2084 ttgaacataattatttgacaataattaagttttctagggaataaacggaaatatcttctt 2143 

II I II I III I II llll I I I I II I Mil II 
Db 18155 TTTTTAAAAAATTTTTTAAAAAAATTGAAAAATAAATAAATTATATTTCATTATAAAATT 18096 

Qy 2144 cttttttgtaaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagc 2203 

III I llllll III II II I II llll 
Db 18095 TATTTATTAAAAATTTTTTGTTTATTTTTTAAAAAACATGATTTTATTATATAAATATTT 18036 

Qy 2204 tttaagtagtcagtgtaactctcaaaatctggtcataacttctaggctgagtttgctgtg 2263 

llll II I II I I I II I I I II II III 
Db 18035 TTTA--TAAAAAIAATACMTTAAGAAATTTTTAAAAAATTTATATTAAATTATTTAAAT 17978 

Qy 2264 ctacagtagtaagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatc 2323 

I I llll I I II I I I II I I I I 
Db 17977 AAITTAATTTTTCTATATATATATATATATTATATAAATATTCAATAATATATAAATTTA 17918 

Qy 2324 tacaacttttcctttttcttcaattaacatatggttgattcaagttccgatctataataa 2383 

II I I I llll HIM llll II II II II II HUM 
Db 17917 TAAATATATAATAATTAATTAAATTATTATATTATTTATATAAATT-AAATTAATAATAA 17859 

Qy 2384 tttattacgatttatcaatttcaattaccttatatcatcctattataaatataagtcagt 2443 

II II II II II II lllllll I I II III III I 

Db 17858 ATAAATATGAGAATATAAATTTTTATAAATTATATCTACATTTTTAAATTTTAAAATTTT 17799 

Qy 2444 tcaattcagttttcgaaagttcccaaaaattttgaattttattaaatttattccctaaaa 2503 

I I II I II I II II II II III II II II III I 

Db 17798 TTATTTAAATTATTAGATATATAATAATATATTAAATATTTATATATATATAAATATCTA 17739 

Qy 2504 ccgaaatagttatatctttcaaatttaagtttcatttttcaatccgatttcaatttcatc 2563 

I II II I I III II I II III I II 

Db 17738 TTAATTTATAATTAGTATATAGTTTTTTTTTAAAAAAAAAATTATTTTTTTTAAAAAATT 17679 

Qy 2564 cttttataactctctattatctataattacataaatttcaaattaattttgaaatattta 2623 

llll III I I II llll I II llllll III II I llll I 
Db 17678 TTTTTTTAAAAATGAAAAATAAATAAAT---TATATTTCATTATAAAATTTATTTATTAA 17622 

Qy 2624 cactttagtccctaagttcaaaactataaattttcactttagaaattaatcatttttcac 2683 

I III I II II III II III I I II Ml III 
Db 17621 AAATTTTTTGTTTATTTTTTAAAAAACATGATTTTATTATATAAATA TTTTTT 17569 

Qy 2684 atctaagcatcaaatttaaccaaatgacacaaatttcatgattagttagatcaagctttt 2743 

II III I MM II I I III II II II III I II III 

Db 17568 ATAAAAATAATACATTTAAGAAATTTTTAAAAAATTTATATTAAATTATTTAAATAATTT 17509 

Qy 2744 gagtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaatttg 2803 

lllllll III I II I III III I II I II II I 

Db 17508 AATTITTCTATATATATATATATATTATAAATATTCAATAATATATAAAITIATAAATAT 17449 

Qy 2804 aacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttttgt 2863 

III II I I II I Mil I III II I II I 

Db 17448 ATAATAATTAATTAAATTATTAAAAAAAAAAAAAAAAAAAAATAATTTTTTATTATTAAT 17389 

Qy 2864 tgcaaacggtggagagaagagggaaatgaagattgaccatatttttttattatgttttaa 2923 

I I I II llll II II I I I! I I llll 
Db 17388 TAAATATTAGTAATAAATAAATTTTATTTAATTATTAGTTATTAAATTTAT 17338 

Qy 2924 catataatattaataatttaatcataattatactttggtgaatgtgacagtggggagata 2983 

II II I I I Mil II I II I III I III II II 

Db 17337 TTATTATTAATTAAATATTAATAATGAATAGATTTTATTTAATTATTAATTATTAAATTT 17278 

Qy 2984 cgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagagtgat 3043 

llll I I I III II 'II I I II 

Db 17277 ATTTATTATTAAAATTTAAAMTTATTTTCATTTTAATATATATATATATATATATATAT 17218 

Qy 3044 caaagtttgagctgccttcaatgagccaatttttgcccataat---ggataaaggcaatt 3100 

III I II II I III Mil MM I II 

Db 17217 AAITTTTATTAATTATTTTAAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTI 17158 

Qy 3101 tgtttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtggcctggt- 3160 

I II III II I I I III I II II III I I II 
Db 17157 TATTAAGTATAATTTAATAAATAATTTTTTTTTAAAAAAAAATATTTTTTTAAGTTTTAA 17098 



Qy 3161 cacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaatgtaatat 3220 

I III I I I I I I Mil II II II 

Db 17097 TTATATAATAAATTTATGAATAGGGGGAATAAATTTATTTTCATTTTACATATATATATA 17038 

Qy 3221 tatattttaaaataaaattatgttatttagattcttaatattttggagcattccatacta 3280 

Mil I I III Mil I III Mil I II llll 
Db 17037 TATATATATATATACAATTAATTAATTCAGATTTAGTGATTAAAATAAATTATTTTATTA 16978 

Qy 3281 taatttcgtaacataatattaaaatatagta-atataaagtgtaattaactttaaattac 3339 

II II III llll III II I II II II I lllllll 
Db 16977 TACTTATATAATTTAATTGAAAATTAAATTATATGTATATATATATAAATATATGAATTG 16918 

Qy 3340 aagcataatattaaattttgaatcaattaatttttatttctattattttaattaatttag 3399 

II I III llll I I II II llll I llllll lllllll 
Db 16917 AATTTTTATAAAAAATCATTTTAAATTTTTATTATATTAAAAATATTTTTATTAATTATT 16858 

Qy 3400 tctattttttcaaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaa 3459 

I I I II I II II III I MUM! Ill I II I II 
Db 16857 TAAAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGTATAATTTAA 16798 

Qy 3460 caactcatgttatacttcaaaattataagtattatatttaccttgatgatttatttatta 3519 

II llll II I I llll I I I II I II III II I I 

Db 16797 TAAATCATTTTTTTTTAAAAAAAAAATATTTTTTAAGTTTTAATTATACAATAAATTTAT 16738 

Qy 3520 gtatattaattctgattataattatggtgggatacaatcgctttccactaaatattttaa 3579 

I III I I I II II I II I I II III llll 

Db 16737 GAATAGGGGGAATAAATTTATTTTCATTTTTTTATATATATATATATATATATA-ATTAA 16679 

Qy 3580 ctatgatttataaatttatttcaacatcgtatatttacttattaatacataatttatcat 3639 

III II I I III II I III II llll llll I II 

Db 16678 TTATTTCAGATTTAGTGATTAAAATAAATTMTTTAITATATTTATATAATTTAATTGAA 16619 

Qy 3640 aattttatggaaattgagaccaagaaacattaagagaacaaattctataacaaagacaat 3699 

llll I I II I I I I I II II I III III 

Db 16618 AATTAAAATTATATATATATATATATATATAAA TATAAATTGAAT 16574 

Qy 3700 ttagaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaa 3759 

II Mil llll I III III I I III Til II - 
Db 16573 TTTTTAAAAATTATTTTTAATTTTTATTATAATAAAAATATTTCTTATTAATTATTTTAA 16514 

Qy 3760 atcaaatgaactaaataagataatataacatacggaacatcttacttgtaatcttacatt 3819 

II I I I II I II I III I I I I I III I MM"! I I 
Db 16513 ATAATTTTATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGTATAATTTAATA-- 16454 

Qy 3820 cccataattttattatgaaaaataatcttatattactcgaactaaatgttgtcacaaatt 3879 

nniiii ii i in: III III II III II I II 

Db 16455 - -AATAATTTTTTTTTTAAAAAAAATATTTTTTT AAGTTTTAATTATATAATAAA 16403 

Qy 3880 attatctaaataaagaaaaacacttaatttttataacattttttcatatatttgaaagat 3939 

llll II III II I II I I llll I I llllll I I 
Db 16402 TTTATGAATAGGGGGAAIAAATITATTTTCATTTTACATA1ATATATATATATATATATA 16343 

Qy 3940 tatattttgtatatttacgtaaaaatatttgacatagattgagcaccttcttaacataat 3999 

II I II I III I I I III I I I llll Mil 
Db 16342 TACAATTAATTAATTCAGATTTAGTGATTAAAATAAATTATTTTATTATACITATATAAT 16283 

Qy 4000 cccaccataagtcaagtatgtagatgagaaattggtacaaacaacgtggggccaaatccc 4059 

I II I II I I I I II I I I I II II 
Db 16282 TTAATTGAAAATTAAATTATATGTATATATATATAAATATATGAATTGAATTTTTATAAA 16223 

Qy 4060 accaaaccatctctcattctctcctataaaaggcttgc-tacacatagacaacaatcca 4117 

I I I I II I I llll II II II I II M 
Db 16222 AAATCATTTTAAATTTTTATTATATTAAAAATATTTTTATTAATTATTTAAAATAATTTT 16163 

Qy 4118 cacacaaatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcac 4177 

II I I II II I II II Mil II I I II I 

Db 16162 ATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGTATAATTTAATAAATCATTTTT 16103 

Qy 4178 cctttcttccttttccaacttttactcataagtgtctcactagt gaccggtagcc 4232 

II III I I III I I I II I II I 

Db 16102 TTTTAAAAAAAAAATATTTTTTAAGTTTTAAITATACAATAAATIIATGAATAGGGGGAA 16043 
Db 9135 ATATATTTAATTATTTAATATAATATAMTTGTTTTATTTATTAAATTATTATATTAAT- 9075 

Qy 3079 cccataatggataaaggcaatttgtttagttcaactgctcacagaataatgttaaaatga 3138 

I I II III I I I I II! II! I I II I 
Db 9076 ATATTTTTATTTATTTAAAATATTATATTAAATAATAAAGACAATATATTAAACATATAA 9017 

Qy 3139 aattaaaataaggtggcctggtcacacacacaaaaaaaaactaatgttggttggttgaat 3198 

I I I I I I I II II II I II II II III 

Db 9016 ATTAATATAATATTTTATTTTATTTATTAATTTGTTTAATATATTATTATTTTATTTAAT 8957 

Qy 3199 tttatattacggaatgtaatattatattttaaaataaaattatgttatttagattcttaa 3258 

I I mi i ii mi urn i ii nun inn i m 

Db 8956 TATTTATTTAATATATTATTATTTTATTTAATTATTTAATTATATTATTATTTTATTTAT 8897 

Qy 3259 tattttggagcattccatactataatttcgtaacataatattaaaatatagtaatataaa 3318 

II III I II III II III I I 1 1 1 1 II III I I II III 
Db 8896 TAATTTATATATTTTAATATATTATTTTATTTATTTAATATAATATTTAATTTATTTAA- 8836 

Qy 3319 gtgtaattaactttaaattacaagcataatattaaattttgaatcaattaatttttattt 3378 

I Mil I III lllll II III I III IIM! I II 

Db 8837 -TATAATAATATTTTAATTATTTAATATAATATTTCATTTAATTCATTTAATATAATATT 8779 

Qy 3379 ctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaataaaaat 3438 

II I I I HUM!!! Ill Mill I III I I 

Db 8778 TTAATTATATTATTAATTTATATATTTTAACATTTTATTTAATTTATTTAATATATATAA 8719 

Qy 3439 aatttttccttaatgttgaaacaactcatgttatacttcaaaattataagtattatattt 3498 

lllll I I! I I I IIM II II III II II I Mill 
Db 8718 TATTTTATTTATTAATTATATTATTAATTTATATATTTTAATATTTTATTTAATTTATTT 8659 

Qy 3499 accttgatgatttatttattagtatattaattctgattataattatggtgggatacaatc 3558 

i i ii mm iiiiim i i iiiii i m ii 

Db 8658 AATATAATATTTTATT TATTAATTATATTATTAATTTATTTAATATAATATT 8607 

Qy 3559 gctttccactaaatattttaactatgatttataaatttatttcaacatcgtatatttact 3618 

II III Ml I I III I II I II I I II lllll I 
Db 8606 TTATTTATTTAATTATATATATTATTAATTTATATATTTTAATATTTTATTTAATTTATT 8547 

Qy 3619 tattaatacataatttatcataattttatggaaattgagaccaagaaacattaagagaac 3678 

MM I II III lllll III I II I I I 
Db 8546 TAATATAATATTTTATTTATTAATTATATTATTAATTTATTTAATATAATAGTTTATTTA 8487 

Qy 3679 aaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagtactct 3738 

MM I I I II II I 1 1 III IMMI I 
Db 8486 TTTAATTATATATATTATTAATTTATATATTTTAATATTTTATTTAATTTATTTAATATA 8427 

Qy 3739 taaccaaacacaaaaattcaaatcaaatgaactaaataagataatataacatacggaaca 3798 

I I I MM III II II I II I I II I 
Db 8426 ATATTTTATTTATTAATTATATTATTAATTTATATATTTTAATATTTTATTTAATTTATT 8367 

Qy 3799 tcttacttgtaatcttacattcccataattttattatgaaaaataatcttatattactcg 3858 

I II I MM II III lllll lllll! II II II I! II I 
Db 8366 TAATATATATAATATTTTATT-TATTAATTATATTATTAATTTATATATTTTAATATTTT 8308 

Qy 3859 aactaaatgttgtcacaaattattatctaaataaagaaaaacacttaatttttataacat 3918 

I III I lllll MM I I II II II III IMMI I 

Db 8307 ATTTAATTTATTTAATATAATATTTTATTTATTAATTATATTATTAA---TTTATATATT 8251 

Qy 3919 tttttcatatatttgaaagattatattttgtatatttacgtaaaaatatttgacatagat 3978 

II I I MM! I IIIIIM! I lllll I IIIIIM I I I 

Db 8250 TTAATATTTTATTTAATTAATTATATTATTAATTTATATTTTTTAATATTTTATTTTATT 8191 

Qy 3979 tgagcaccttcttaacataat 3999 

II Mil III I 
Db 8190 TTATTTATTTAATATAATATT 8170 



RESULT 15 
DMU37541/C 

LOCUS DMU37541 19517 bp DNA circular INV 04-APR-2000 
DEFINITION Drosophila melanogaster complete mitochondrial genome. 
ACCESSION U37541 
VERSION U37541.1 GI:1166529 
KEYWORDS . 
SOURCE Drosophila melanogaster. 
ORGANISM Mitochondrion Drosophila melanogaster 
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; 
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila. 
REFERENCE 1 (bases 12511 to 12682) 
AUTHORS Clary,D.O., Goddard,J.M., Martin,S.C., Fauron,C.M. and 
Wolstenholme,D.R. 
TITLE Drosophila mitochondrial DNA: a novel gene order 
JOURNAL Nucleic Acids Res. 10 (21), 6619-6637 (1982) 
MEDLINE 83090428 
REFERENCE 2 (bases 5269 to 5695) 
AUTHORS Clary,D.O., Wahleithner,J.A. and Wolstenholme,D.R. 
TITLE Transfer RNA genes in Drosophila mitochondrial DNA: related 5' 
flanking sequences and comparisons to mammalian mitochondrial tRNA 
genes 
JOURNAL Nucleic Acids Res. 11 (8), 2411-2425 (1983) 
MEDLINE 83220794 
REFERENCE 3 (bases 404 to 5272) 
AUTHORS de Bruijn,M.H. 
TITLE Drosophila melanogaster mitochondrial DNA, a novel organization and 
genetic code 
JOURNAL Nature 304 (5923), 234-241 (1983) 
MEDLINE 83245048 
REFERENCE 4 (bases 804 to 1778) 
AUTHORS Satta,Y., Ishiwa,H. and Chigusa,S.I. 
TITLE Analysis of nucleotide substitutions of mitochondrial DNAs in 
Drosophila melanogaster and its sibling species 
JOURNAL Mol. Biol. Evol. 4 (6), 638-650 (1987) 
MEDLINE 88174373 
REFERENCE 5 (bases 5268 to 13619) 
AUTHORS Garesse,R. 
TITLE Drosophila melanogaster mitochondrial DNA: gene organization and 
evolutionary considerations 
JOURNAL Genetics 118 (4), 649-663 (1988) 
MEDLINE 88212147 
REFERENCE 6 (bases 441 to 2967) 
AUTHORS Satta,Y. and Takahata,N. 
TITLE Evolution of Drosophila mitochondrial DNA and the history of the 
melanogaster subgroup 
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 87 (24), 9558-9562 (1990) 
MEDLINE 91088557 
REFERENCE 7 (bases 14215 to 14512) 
AUTHORS Ballard,J.W., Olsen,G.J., Faith,D.P., Odgers,W.A., Rowell,D.M. and 
Atkinson,P.W. 
TITLE Evidence from 12S ribosomal RNA sequences that onychophorans are 
modified arthropods 
JOURNAL Science 258 (5086), 1345-1348 (1992) 
MEDLINE 93088057 
REFERENCE 8 (bases 14917 to 19517) 
AUTHORS Lewis,D.L., Farr,C.L., Farquhar,A.L. and Kaguni,L.S. 
TITLE Sequence, organization, and evolution of the A+T region of 
Drosophila melanogaster mitochondrial DNA 
JOURNAL Mol. Biol. Evol. 11 (3), 523-538 (1994) 
MEDLINE 94285822 
REFERENCE 9 (bases 1 to 408; 13319 to 19517) 
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S. 
TITLE Drosophila melanogaster mitochondrial DNA: completion of the 
nucleotide sequence and evolutionary comparisons 
JOURNAL Insect Mol. Biol. 4 (4), 263-278 (1995) 
MEDLINE 96423163 
REFERENCE 10 (bases 1 to 19517) 
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S. 
TITLE Direct Submission 
JOURNAL Submitted (03-OCT-1995) Laurie S. Kaguni, Biochemistry Department, 
Michigan State University, East Lansing, MI 48824-1319, USA 
FEATURES Location/Qualifiers 
source 1..19517 
/organism="Drosophila melanogaster" 
/organelle="mitochondrion" 
/db_xref="taxon:7227" 
/note="derived from new and previously submitted 
PFMAL1P3/C 

LOCUS PFMAL1P3 67970 bp DNA INV 15-DEC-1999 
DEFINITION Plasmodium falciparum MAL1P3, complete sequence. 
ACCESSION AL031746 
VERSION AL031746.9 GI:6594243 
KEYWORDS HTG. 
SOURCE malaria parasite P. falciparum. 
ORGANISM Plasmodium falciparum 
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium. 
REFERENCE 1 (bases 1 to 67970) 
AUTHORS Bowman,S., Churcher,C., Harris,B., Harris,D., Lawson,D., Quail,M. 
and Barrell,B. 
TITLE Direct Submission 
JOURNAL Submitted (24-SEP-1998) P. falciparum Genome Sequencing Consortium, 
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge 
CB10 1SA, UK 
COMMENT On Dec 16, 1999 this sequence version replaced gi:5763807. 
For more information about this sequence or the Malaria Project, 
see http://www.sanger.ac.uk/Projects/P_falciparum. IMPORTANT: This 
sequence is unfinished and does not necessarily represent the 
correct sequence. Work on the sequence is in progress and the 
release of this data is based on the understanding that the 
sequence may change as work continues. The sequence may be 
contaminated with foreign sequence from E.coli, yeast, vector, 
phage etc. 

FEATURES Location/Qualifiers 
source 1..67970 
/organism="Plasmodium falciparum" 
/strain="3D7" 
/db_xref="taxon:5833" 
/chromosome="1" 
gene complement(1748..3276) 
/gene="MAL1P3.01" 
CDS complement(join(1748..2598,2748..2848,2990..3276)) 
/gene="MAL1P3.01" 
/note="MAL1P3.01, conserved hypothetical protein, len: 412 
aa, similarity: UPF0006 family eg to 
YBL055CABL0512/YBL0511, YBF5_YEAST (418 aa), fasta 
scores: opt: 316, E(): Ue-12, (33.2% identity in 271 aa 
overlap)" 
/codon_start=1 
/product="conserved hypothetical protein, UPF0006 family" 
/protein_id="CAB63556.1" 
/db_xref="GI:6594244" 
/translation="MKLVFHYIKYlNVLFYISIIFLKSNSLKIYNDLRYISTVEYKV 
LQIKKRSNLKKNHNIRKMEDNESSFIDIGSMLTDKMFDGVYNSRKHENDLQNVLNRAK 
NNNVDKIIITCTCLAEIDKSLKICETYDPEGKFLYLSAGVHPTNCYEFIDKEHEEKE 
IIAKKEYEEFIKYFRNEQVENSKMENGNKKICDGEKDMNNLNEILLERNLDTIPGFRY 
NERDKEYLENLRNKIIKYPNRIVCIGEIGLDFDRLYFCSKYIQIRYFIFQLRLVOMFN 
LPMFLHMRNCSETFFKIVDIYKFLFEKNGGVIHSFTDKEDIVHIIVQNYKNLYIGVNG 
CSLRSLENINAVRKIPLNLLLLETDAPWCGVKRTHASYEYIKDTYERRAYTNLRKIKN 
IIKCDDNTIFKERNEPYNIA" 
misc_feature complement(2599..2610) 
/gene="MAL1P3.01" 
/note="potential splice acceptor sequence" 
misc_feature complement(2742..2747) 
/gene="MAL1P3.01" 
/note="potential splice donor sequence, atg/gttaaa" 
misc_feature complement(2849..2861) 
/gene="MAL1P3.01" 
/note="potential splice acceptor sequence" 
misc_feature complement(2984..2989) 
/gene="MAL1P3.01" 
/note="potential splice donor sequence, aaa/gtaaaa" 
gene 5005..5496 
/gene="MAL1P3.02" 
CDS 5005..5496 
/gene="MAL1P3.02" 
/note="MAL1P3.02, hypothetical protein, len: 163 aa; 
contains possible signal sequence" 
/codon_start=1 
/product="hypothetical protein, MAL1P3.02" 
/protein_id="CAB63557.1" 
/db_xref="GI:6594245" 
/translation="MKLLNNRFWLCPIIILFFFLNSWLGNNNRNNINFHETENAAK 
AMRRLLSGEINSIKLDNGDELKIKLNDEKHKDSTRWDRSYSFISNLEEEKYSQTDLFR 
KKQEINEANTKnEDRQEFYILNNDEIENIATRFVLENNFDELYIQSFKQSLIDIIQS 
LNN" 

misc_feature 8020..10389 
/note="possible cen1, region of very high [a+t] content" 
gene 14884..20352 
/gene="MAL1P3.03" 
CDS 14884..20352 
/gene="MAL1P3.03" 
/note="MAL1P3.03, putative ABC transporter, len: 1822 aa" 
/codon_start=1 
/product="putative ABC transporter" 
/protein_id="CAB63558.1" 
/db_xref="GI:6594246" 
/translation="MTTYKENVGISNKGNKKKKSCQNISFLNFLSFDWIRPLINDLIR 
GDIQELPNICRNFDVPYYASKLEENLRDIEVEDSEFYSEKNSSNEHVLHHCNSNDASE 
KKVYNVYYHNILWSILKTFKFRIILIISFYILETLIVTLGGKFIDYYMRILEGQKIPV 
YISFLKDFKVFSGLVWMIMFFHLFFEALLHFYFHLFTINLKVSLMYFLYKINLCSNN 
NHLQNPDAFYNTYRRFSSQTEIDEISRDFLSIGRNASSSSSGIRNNNKNIDNNKFVEN 
DYIINFIRSTRRMERDSLNENRSLPNVNIYNIMFSDVPSVTFFVTSCINLFNVFVKIF 
MSFYVFHIRIGSNSVGIAIWLSIALYSAMILFEFLPSLFKSKYLIYRDKRIDNMHHVL 
KEFRLIKMFNWESFAFRYINIFRMKEMRYCRIRLYLSNIGVFISSISSDIVEWIFFI 
YLRDRLNKREEIKFTSIIMPLYVYKILISNVANFPNLVNNVMEGIVNIKRLNNYINDH 
LYYNDIKNYFMYRTRYNEDYNIWDKTFLQNENITSHDDGTSHNLKHLKNVIRNKLTN 
MFKYFFFYHKMNYHRNIINKOILSGLLRNVDDNTNKRICFOEHRSNSTYNYNSSHIHE 
KKEEYENIHNSSNSTMSNEFKERKKNNEYIIKLENCSFGLSYDNKCDNDHILKNINFN 
LKRNSLAIIIGKVGSGRSAFFHSILGDFNMTHGNLYIENFFKKMPILYVPQNSWLFMG 
NIRSMILFGNEYNPLIYKYTILQSELLNDLSTIEHGDMRYINDDHNLSKGQKVRICLA 
RALYEHYIHMHKLCTDYEKRLIQPNEILDKDLINNRNISSYNNKKSKLVNYNIPFNEN 
YLQKCLMDDNNFYLYLLDDIFTSLDPSISKKIFSNLFCREDNISFKDNCSFIISMNKS 
TLDNFLIEDILDNVQYEVNIFEIODKTLKYRGNISEYMEKNNLNITKESHWGYSNLNT 
IDYTRIKLFDEVELNHVKHSNKMIYREAYFVKGNTESVSFEIDSINREYIKKMKKKNY 
RKEHMNKNNKDNNNNNNNSNKDDHININMNDNHRNYNDINLGPNSTDDSPTVSSLGNE 
YTLDTYTSNNSDKEEIVKPLYKDTHEEFNKSSSMPFVKSSSNMINNPSNFKYEDNSSS 
FKGSISLETYLWYFQQVGFVLLTSWIFMLISIFTDEIRFVFLTMMSIISKNNKEHSD 
TILQRQVRYLEYFVILPIISLVTSGICFSMIIYGNITSAIRVHNNILYSILNAPLYIF 
YNUNLGNIINRFIIDISAFDYGFLKRIYKAFFIFFRCILSSLLIIYMIRDCIFIFPFV 
IILIYFFVFKRFSRGCREAQRLYLSCHTPLCNIYSNALSGKNIINIYKKNTYHLDVYE 
HYINNFRISYFFKWLINIWASLYIRIFILLLTTYIIMHPHLYASGIIKLYKEKNYVRI 
LSTLGYCISFSARLGVIIKFLLCDYTHIEKEMCCVQRLEEFAKISNKENASMNKENEL 
NVITTQTYKERNENISDRISAIVEYRNVSLSSIINSSQDDESRKKYGIKFENVYVSYK 
RKIPLVNGTYKYIDEEPSLKNINMYALKNQKIGIVGRSGAGKSTILLSILGLINISQG 
KITVEGRDI RT YNRKGEDS I IG I LAQ SSFVFY NWNI RT FID PY NNFTDDE I VHALKLN 
GINLGRNDLYKYMHRQDMKSNYKRIIQTSKVINQSNDNTILLTNDCIRYLSLVRLYLN 
RHRYRIILIDEIPIFNLNNSVHDELNSFLIGKMSFNYIIRNHFPNNTVLIISHHANT 
LSCCDYIYVLRRGEITYRCSYEDVKTQSELSHLLEMDD" 
rRNA 23896..31533 
/gene="rRNA" 
/note="region containing small subunit, 5.8S and large 
subunit rRNA genes and spacer regions" 
gene 23896..31533 
/gene="rRNA" 
gene complement(31966..32775) 
/gene="MAL1P3.04" 
CDS complement(join(31966..32476,32675..32775)) 
/gene="MAL1P3.04" 
/note="MAL1P3.04, conserved hypothetical membrane protein, 
len: 203 aa, similarity: P. falciparum chromosome 2, 
PFB0110W, O96126 predicted integral membrane protein (255 
aa), fasta scores: opt: 335, E(): 4.9e-15, (36.1% identity 
in 191 aa overlap)" 
/codon_start=1 
/product="conserved hypothetical membrane protein, 
MAL1P3.04" 
/protein_id="CAB63559.1" 
/db_xref="GI:6594247" 
/translation="MKRSYTFINVTILLFLTLLLFLTYYNYDTFSKTKFNNNIKIDIN 
RFKRIIAEASEEQKYPWEEDFCLILNEEELIRPEHNDSPYLPEHYENIDKINELSINS 
TRIWKETIKRMRQNYERETDNMNHNWRDFMWHYKWANIYLYRVHKLINITLKDLTNPI 
HDKEETITTWIKWIQEDIEYFLFNLQVEWLRILTLELFYKNKE" 
misc_feature complement(32477..32486) 
 76237  76316: gap of unknown length 
 76317  77913: contig of 1597 bp in length 
 77914  77993: gap of unknown length 
 77994  80808: contig of 2815 bp in length 
 80809  80888: gap of unknown length 
 80889  82776: contig of 1888 bp in length 
 82777  82856: gap of unknown length 
 82857  85682: contig of 2826 bp in length 
 85683  85762: gap of unknown length 
 85763  89308: contig of 3546 bp in length 
 89309  89388: gap of unknown length 
 89389  93999: contig of 4611 bp in length 
 94000  94079: gap of unknown length 
 94080 109466: contig of 15387 bp in length 
109467 109546: gap of unknown length 
109547 110183: contig of 637 bp in length 
110184 110263: gap of unknown length 
110264 110842: contig of 579 bp in length 
110843 110922: gap of unknown length 
110923 111636: contig of 714 bp in length 
111637 111716: gap of unknown length 
111717 112030: contig of 314 bp in length 
112031 112110: gap of unknown length 
112111 112674: contig of 564 bp in length 
112675 112754: gap of unknown length 
112755 113301: contig of 547 bp in length 
113302 113381: gap of unknown length 
113382 113979: contig of 598 bp in length 
113980 114059: gap of unknown length 
114060 114698: contig of 639 bp in length 
114699 114778: gap of unknown length 
114779 115187: contig of 409 bp in length 
115188 115267: gap of unknown length 
115268 115938: contig of 671 bp in length 
115939 116018: gap of unknown length 
116019 116539: contig of 521 bp in length 
116540 116619: gap of unknown length 
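
The contig lengths in the list above are consistent with 1-based, inclusive coordinates, so that length = end - start + 1. The short Python sketch below checks this for a few start/end/length triples taken from the list. The function name `span_length` and the choice of sample rows are illustrative assumptions, not part of the report. Applied to a gap row, the same arithmetic gives 80 positions, which suggests (as an assumption) that each gap of unknown length is held by a fixed-size placeholder.

```python
# (start, end, reported length) triples copied from contig rows in the list above.
CONTIGS = [
    (76317, 77913, 1597),
    (82857, 85682, 2826),
    (94080, 109466, 15387),
]


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


for start, end, reported in CONTIGS:
    computed = span_length(start, end)
    # Each computed length should agree with the length stated in the list.
    print(start, end, computed, computed == reported)

# A gap row from the same list (76237 to 76316) spans a fixed number of positions.
gap_span = span_length(76237, 76316)
print("gap span:", gap_span)
```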



Query Match 3.4%; Score 189.2; DB 55; Length 161891; 

Best Local Similarity 40.1%; Pred. No. 1.5e-13; 

Matches 1471; Conservative 0; Mismatches 2156; Indels 42; Gaps 14; 
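
As a consistency check on the summary figures above, the reported Best Local Similarity can be reproduced from the column counts, assuming it is defined as matches divided by all aligned columns (matches + conservative + mismatches + indels). That definition is an assumption inferred from these numbers, not something the report states, and the function name `local_similarity` is illustrative.

```python
def local_similarity(matches: int, conservative: int, mismatches: int, indels: int) -> float:
    """Percentage of aligned columns that are exact matches (assumed definition)."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Counts taken from the summary line above.
pct = local_similarity(matches=1471, conservative=0, mismatches=2156, indels=42)
print(f"{pct:.1f}%")
```

Under that assumption the alignment covers 3669 columns, and the result rounds to the 40.1% shown in the report.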

Qy 1800 ttaaataattattaattaaaatttatggacttttggactgtctgactaattttcagaatt 1859 

III Mill I III III llll III I II III II II 

Db 139978 TTATATAATAAATAAATAATATTTTTGTATATATGTGAATGTATATATATATTATATATA 139919 



Qy 1860 ttattttggttttgggttttgttgaattttttagataattattttaaatattctgcataa 1919 

I I I II III II I I III III III II llll I I 
Db 139918 TATATATATATTATATATTTATTTATATAATTATATATATATAATATATATATATGAATA 139859 

Qy 1920 tttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaagaat 1979 

II MM I III II I I I I II I I I II II 
Db 139858 TTAGATTAATAAATATTGAATATAATATTATAATATATATATATATATATATATATATAT 139799 

Qy 1980 ttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataagttag 2039 
llll I II III II I III II llll I I I II II II 

Db 139798 ATTTATAAATATATATTATTTATGAAAATATATATTATTTAAATATAATATGTATATAAA 139739 

Qy 2040 tattacgatttttagtttg-atttggtggaaagtaatgtatgtttttgaacataattatt 2098 

III I I I III I I II I II I llll I II I III 

Db 139738 TAATTAAAGATAAATATTGTGTATATTGATATATATAATTAGTTTGTATGAATTAATATA 139679 

Qy 2099 tgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgtaaaatt 2158 

III III II II llll I I llll I 
Db 139678 TGAGTATATATATATTAATATTATAATATATAAATATATATAANNNNNNNNNNNNNNNNN 139619 

Qy 2159 actaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcagtg 2218 

Db 139618 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 139559 

Qy 2219 taactctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtagtaagtc 2278 

I II I I I III I II I I I II III II I II 
Db 139558 NNNATATATATTTATAATAATATCATCAA--TTTAATTATATGTTATATAAAGTATAATA 139501 



Qy 2279 tatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttttccttt 2338 

II II I I I I II I II I I I llll 
Db 139500 TAATAAATAATTAATTATTATTTATAATTTCATTATTATATTAATAATGITTGTAATTAT 139441 

Qy 2339 ttcttcaattaacatatggttgattcaagttccgatctataataatttattacgatttat 2398 

II I III II I III II II I I I I III II 
Db 139440 TTAAATATATAATATTATAATTATTATTATTTTAATGTTAATAATTATATATATCTTATA 139381 

Qy 2399 caatttcaattaccttatatcatcctattataaatataagtcagttcaattcagttttcg 2458 

II I II III I II lllll II I I I II I I I I 
Db 139380 ATATGATATATATAATATTTAATATTATTAATTTTAATATATATATTAAATAATTAATAT 139321 

Qy 2459 aaagttcccaa-aaattttgaattttattaaatttattccctaaaaccgaaatagttata 2517 

I I II III III I III II lllll I III I llll III 
Db 139320 ATACTTTTTAATATATTATAAATATTTATAAATAATATATATAATAATATATTATTAATA 139261 

Qy 2518 tctttcaaatttaagtttcatttttcaatccgatttcaatttcatccttttataactctc 2577 

I II I III I I I II III III II II III I I I 

Db 139260 TATTAATGTATATTATTTTTTATATTTATAAATTTTAAATATTAATTTTATATTAATATA 139201 

Qy 2578 tattatctataattacataaatttcaaattaattttgaaatatttacactttagtcccta 2637 

llll I I llll II I llll I II I I I I I I I I 
Db 139200 TATTTAATTATAATATTTATATAIAAATA1ATTAITAGAIATCTAATAATATTTTATGTT 139141 

Qy 2638 agttcaaaactataaattttcactttagaaattaatcatttttcacatctaagcatcaaa 2697 

II I II 1 1 1 1 M 1 1 | mi II llll III I II 

Db 139140 TTTTAATAATTATAAATTAT ATTATATATAATTTATAAATAATATTATAA 139091 

Qy 2698 tttaaccaaatgacacaaatttcatgattagttagatcaagcttttgagtcttcaaaaca 2757 

II llll II I lllll II I II lllll I II 

Db 139090 TAAATAATATATATAATGAAAATATTAATAGTTTTATATATATTATGAGTATATTAATTT 139031 

Qy 2758 taaaaattacaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaagcttg 2817 

I I II lllll I I III II I III II II llll 
Db 139030 ATATATATATAAAAATATATTAAATATATATATATTATTTTAATTATTAGTAATAGTT - - 138971 

Qy 2818 gccgaatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacggtggag 2877 

I III III I III II I III I II I 

Db 138972 ATAATTAAATTTATTATTAAATTATATTTAAATTTTGTTTATCTATTACTGCAATT 138917 

Qy 2878 agaagagggaaatgaagattgaccatatttttttattatgttttaacatataatattaat 2937 
llll II III III II I III I lllll lllllll 
Db 138916 TATATATGTATTATTATAAAATACATTTTTAAATAAATTTTTTATATATATATTATTAAI 138857 

Qy 2938 aatttaatcataattatactttggtgaatgtgacagtggggagatacgtaaagtatttta 2997 

II llll I. I llll 
Db 138856 TATATAAT-AAATAAGTATTATATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 138798 

Qy 2998 acattatactttttgcaagcagttggctggtctacccaagagtgatcaaagtttgagctg 3057 

III III 

Db 138797 NIMKNNNNNNNNNNNNNNNNNN^ 138738 

Qy 3058 ccttcaatgagccaatttttgcccataatggataaaggcaatttgtttagttcaactgct 3117 

I III I I II I llll I II I II I II III 
Db 138737 ATTATAATTATATTACTTATATACATATTATATGACCATATATAATATATAGATAATCAT 138678 

Qy 3118 cacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaa a 3172 

I I II III III I I II I II II III II I 

Db 138677 AATATGATGATGATAATAGTTATTTTACTTAGTTATCATATATATAATACATAATCATTA 138618 

Qy 3173 aaaaaactaatgttggttggttgaattttatattacggaatgtaatattatattttaaaa 3232 

I I llll II II II III lllll II II II I III III 

Db 138617 TATATGATAATATTAGTATGATTAATATTATAATAATAATTAATATTTATTATAGTAATT 138558 

Qy 3233 taaaattatgttatttagattcttaatattttggagcattccatactataatttcgtaac 3292 

II I II I II III I II M llll I I 

Db 138557 CATATIAATATAATAGTGATAAAATTATTATTTACTCGATATTATIAGTAATATAATTTA 138498 

Qy 3293 ataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataatatta 3352 

lllll I mill I II I I III II II I lllll 
Db 138497 TATAATGTTATATTATATTATATAATTTATATATCATATTATTATATAATTAATATATTT 138438 
Db 98932 TAAATAATTTAATTTAAAATAAAATAAAATAAAMTAATAATATTTATTMTATATAATA 98991 

Qy 2162 aatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcagtgtaa 2221 

II II I I III I! I III I I I I I II 

Db 98992 TATTTAATTAAATTAAATTTATTATTTAATTAATTTAAAATAAAATAAAAATAATAATAT 99051 

Qy 2222 ctctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtagtaagtctat 2281 

III III I II I I III I I I III 

Db 99052 TTATTATTATATAATATATTTAATTAAATTAAATTTATTATTTAATTAATTAATTTAAAA 99111 

Qy 2282 agaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttttcctttttc 2341 

III I I I II I II I II II II II 

Db 99112 TAAAATAAAAATAATAATATTTATTATTATATAATATATTTAATTAAATTAAATTTATTA 99171 

Qy 2342 ttcaattaacatatggttgattcaagttccgatctataataatttattacgatttatcaa 2401 

II llllll II I II I II I III I II I Ml 
Db 99172 TTTAATTAATTTAAAATAAATAATAATTAAATTAATATATATTATAATTAATTAAAAATA 99231 

Qy 2402 tttcaattaccttatatcatcctattataaatataagtcagttcaattcagttttcgaaa 2461 

III I III i Mill III I I I I I I I I I II 
Db 99232 AAATAATATTCAATTATAAATTTATTAATAATTTTAAATAAATAATTAAAATAATAATAA 99291 

Qy 2462 gttcccaaaaattttgaattttattaaatttattccctaaaaccgaaatagttatatctt 2521 

II II I I III I I MM II I III III I I 

Db 99292 ATTAAATAATTTAATTAATATAAATAAACATTATAAATTAAAAAATAATTTAATAAATAT 99351 

Qy 2522 tcaaatttaagtttcatttttcaatccgatttcaatttcatccttttataactctctatt 2581 

I II II II II II I III II I I I I llll 
Db 99352 ATATATAATTATTAATAAATTTAAATACATATATTTTTAATAATAATTTCTTAATTTATT 99411 

Qy 2582 atctataattacataaatttcaaattaattttgaaatatttacactttagtccctaagtt 2641 

I I llll III I I II I II III I II I II I I llll 

Db 99412 TTATTACATTATTATAATATATATTTTTTATTTAAAAAAATATAATTAAATAAATTAATT 99471 

Qy 2642 caaaactataaattttcactttagaaattaatcatttttcacatctaagcatcaaattta 2701 

II I I II I I I II I III I I llll II II 

Db 99472 TAATAATTAAATATATTTAATAGTAATTAAATATTAAACAAATAATATAAATATTATATA 99531 

Qy 2702 accaaatgacacaaatttcatgattagttagatcaagcttttgagtcttcaaaacataaa 2761 

I II I IIIM I I II II I I II I II llll 
Db 99532 ATATTATTAATTAAATTAAAATAATAATTTATTTTATAATTATATATATATATTAATAAT 99591 

Qy 2762 aattacaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaagcttggccg 2821 

I I llll I III II I I lllll III II II I 
Db 99592 TAATTTAAAATATAAATTAAGAAAGATAATTTTATACTTTTATTTAATTAAATATATAGT 99651 

Qy 2822 aatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacggtggagagaa 2881 

III I I II II I I I II II II I II II 

Db 99652 AATAAATAATTTIATGTTATTTATTATAATAATATTTATTATTTTAITTTATTTATTTAA 99711 

Qy 2882 gagggaaa tgaagattgaccatattttttta ttatgttttaaca t — a taatattaat 2937 

I II I I II llll III I III II II II III llllll 
Db 99712 TAAATAAITAATTTTATAAAATATATTIATTTTAAATTAAAATATAAACATATAATTAAT 99771 

Qy 2938 aatttaatcataattatactttggtgaatgtgacagtggggagatacgtaaagtatttta 2997 

I llll III llll III I III I I I III II III 
Db 99772 TAATTAAATATATATATATTTTTTTTAATATAATAAATAATATATTTCACATTTTAATTA 99831 

Qy 2998 acattata ctttttgcaagcagttggctggtctacccaagagtgatcaaagt 3049 

I II I I III II Ml I I Ml II I 

Db 99832 AAATAAAAATAACCATTTATTAATTAACTTAATTAATATATAAAATAAAATAATTTAATT 99891 

Qy 3050 ttgagctgccttcaatgagccaatttttgcccataatggataaaggcaatttgtttagtt 3109 

III I II I II III Mill III II Ml I 

Db 99892 GTGTAATTAAATTAAATATAAAACATTTATTAATAATTAATTATATATAATATTATATAT 99951 

Qy 3110 caactgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacac 3169 

Ml I I I II llll llll llll III I 
Db 99952 TATCTTAAATTAATTAATTTTTTTAATTATTTTAATATAATAATTATTTATTAATATTAA 100011 

Qy 3170 aaaaaaaaactaatgttggttggttgaattttatattacggaatgtaatattatatttta 3229 

llll II III III II lllll I II llllll I 
Db 100012 TTATGTAIATTTTATTTAATTGTTTAATATT--TATTATTATTTTATTTAATATATTATT 100069 



Qy 3230 aaataaaattatgttatttagattcttaatattttggagcattccatactataatttcgt 3289 

II III I II I I II Mill II II I III III I 

Db 100070 AATTTAATTAATTATTATATIATATTTAATTATTATATTATTTATAATATATTATIAATT 100129 

Qy 3290 aacataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataata 3349 

I llllll III II llll I I llll I II III I II III 
Db 100130 TAATTATTGTTTATTTATTATATTATATATTATTATTTAATTATTATTTTATTTATTATA 100189 

Qy 3350 ttaaattttgaatcaattaatttttatttctattattttaattaatttagtctatttttt 3409 

III II II I I III I I II I I Ml II II I Mil II 

Db 100190 TTATAT ATTATTAATTTATATAATATAATTTAATTATATATATATTTAATTTATT 100244 

Qy 3410 caaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgt 3469 

I I 1 1 M 1 1 1 Mill I llllll llllll I 

Db 100245 TATATATTAATTTAATTATATATTTATTTAATTTAATTATATATTTTATTTATTAATTTA 100304 

Qy 3470 tatacttcaaaattataagtattatatttaccttgatgatttatttattagtatattaat 3529 

I I III II III I lllll I I lllll llll llllll I 
Db 100305 TTTTATTTATTAATTTAATTTAATTATATATATATTTAATTTAATTATATATATATTTAA 100364 

Qy 3530 tctgattataattatggtgggatacaatcgctttccactaaatattttaactatgattta 3589 

I I llllll III I I II I I I I llll III II 
Db 100365 TTTAATTATATATATATTTAATTTAATTGTATATATATTTAATTTATTTATATATTTATT 100424 

Qy 3590 taaatttatttcaacatcgtatatttacttattaatacataatttatcataattttatgg 3649 

III II III I Ml II Mill II I MM I MM 

Db 100425 TAATTTATTTATATATTTAATTATATATTTATTTTTATTTTTTATAACTATAAATTATTA 100484 

Qy 3650 aaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaaa 3709 

I I I II II I I llll I II II M I 
Db 100485 ATTTAATTATTAATTTTATTTTATTTACTAAATATATAATTAATTTATATATATTATTGT 100544 

Qy 3710 atgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaa 3769 

I I Ml I II I I II I I I I I I I Ml 

Db 100545 TTTAATTATTTAAATAATTCATTTTATTTAATTAATTTATATATTATTAITAATAATTCT 100604 

Qy 3770 ctaaataagataatataacatacggaacatcttacttgtaatcttacattcccataattt 3829 
llll I I I III I I II II I Ml Ml I I ' 
Db 100605 TTAATTTATAATAATTAATTGTTTAATATATATTATTATATTTAATTATTTAAATATTGT 100664 

Qy 3830 tattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaaa 3889

II I III II III I I II III I I III III III 

Db 100665 TAATIAATTAATTTATTATATTATIATTTAATTAATTTATTATATTATTATTTAATTAAT 100724 

Qy 3890 taaagaaaaacacttaatttttataacattttttcatatatttgaaagattatattttgt 3949 

I I II III I MM II II lllll I I I III 

Db 100725 TIATATTAATATTTTAITATTTAATTTTAATTAAAATTTATTTATTATTATTTTATTTTA 100784 

Qy 3950 atatttacgtaaaaatattt 3969 

I II I lllll 
Db 100785 TATTAATATTAGTATTATTT 100804 



RESULT 13 
AC008206/C 
LOCUS AC008206 161891 bp DNA HTG 08-MAR-2000
DEFINITION Drosophila melanogaster chromosome 3 clone BACR03I15 (D765) RPCI-98
03.1.15 map 96B-96B strain y; cn bw sp, *** SEQUENCING IN PROGRESS
***, 133 unordered pieces.
ACCESSION AC008206
VERSION AC008206.9 GI:7208834
KEYWORDS HTG; HTGS_PHASE1.
SOURCE fruit fly.
ORGANISM Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE 1 (bases 1 to 161891)
AUTHORS Celniker,S.E., Agbayani,A., Arcaina,T.T., Baxter,E., Blazej,R.G.,
Butenhoff,C., Champe,M., Chavez,C., Chew,M., Ciesiolka,L.,
Doyle,C.M., Farfan,D.E., Galle,R., George,R.A., Harris,N.L.,
Hinxle,A., Hoskins,R.A., Houston,K.A., Hummasti,S.R., Karra,K.,





Qy 3612 atttacttattaatacataatttatcataattttatggaaattgagaccaagaaacatta 367: 

I 11 III III I I II Mill II I I I I II I 
Db 19219 TAATTTATAAAAATTTATATTCTCATATTTATTTATTATTAATTTAATTTATATAAATAA 192' 

Qy 3672 agagaacaaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaa 373! 

I I I I I I Mill III II II I II II II II 

Db 19279 TATAATGATTTAATTAATTATTATATATTTATAAATTTATATATTATTGAATATTTATAT 193! 

Qy 3732 gtactcttaaccaaacacaaaaattcaaatcaaatgaactaaataagataatataacata 379: 

ii i i i i mi i mi i i ii ii mi ii 1 1 

Db 19339 ATAATATATATATATATAGAAmTAAMTTATTTAAATAATTTTTCATAAAATTTAAAA 193!

Qy 3792 cggaacatcttacttgtaatcttacattcccataattttattatgaaaaataatcttata 385: 

II lllll llll I II II I II III lllllllll III 
Db 19399 "AAATTTCTTAAATGTATTATTTMTAAAAAATTACTTTTTAAAAAAAATAATTTTAAT 194! 

Qy 3852 ttactcgaactaaatgttgtcacaaattattatctaaataaagaaaaacacttaattttt 391! 

I I II II It I II I I III III || I I III || I 

Db 19157 TTTTTAAAAAAAATAGTAAATAATAAAAAAAAAAAAAAAAAAAAATGAAAATTATATTAT 195: 



RESULT 11 
AC005504/C 
LOCUS AC005504 104992 bp DNA HTG 01-APR-1999
DEFINITION Plasmodium falciparum chromosome 12, *** SEQUENCING IN PROGRESS
***, 3 unordered pieces.
ACCESSION AC005504
VERSION AC005504.3 GI:45585B4
KEYWORDS HTG; HTGS_PHASE1.
SOURCE malaria parasite P. falciparum.
ORGANISM Plasmodium falciparum
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium.
REFERENCE 1 (bases 1 to 104992)
AUTHORS Hyman,R.W., Fung,E.L., Qin,F., Tamaki,T., Kurdi,O.B., Conway,A.B.
and Davis,R.W.
TITLE Plasmodium falciparum 3D7 chromosome 12
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 104992)
AUTHORS Hyman,R.W., Qin,F., Fung,E.L., Conway,A.B. and Davis,R.W.
TITLE Direct Submission
JOURNAL Submitted (21-AUG-1998) Stanford DNA Sequencing and Technology
Center, Stanford University, 855 California Avenue, Palo Alto, CA
94304, USA
COMMENT On Apr 2, 1999 this sequence version replaced gi:4337172.
* NOTE: This is a 'working draft' sequence. It currently
* consists of 3 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 58642: contig of 58642 bp in length
* 58643 58842: gap of unknown length
* 58843 91011: contig of 32169 bp in length
* 91012 91211: gap of unknown length
* 91212 104992: contig of 13781 bp in length.
FEATURES Location/Qualifiers
source 1..104992
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="12"
BASE COUNT 44286 a 9326 c 9564 g 41411 t 405 others
ORIGIN
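The contig and gap coordinates in the working-draft note for AC005504, and its base count, can be checked for internal consistency. The following is a minimal Python sketch, not part of the search output: the names `SEGMENTS`, `segment_length` and `check_layout` are hypothetical, and the numbers are copied from the record above.

```python
# Segment layout copied from the working-draft note for AC005504.
SEGMENTS = [
    ("contig", 1, 58642),
    ("gap", 58643, 58842),
    ("contig", 58843, 91011),
    ("gap", 91012, 91211),
    ("contig", 91212, 104992),
]
# BASE COUNT line and sequence length from the LOCUS line.
BASE_COUNT = {"a": 44286, "c": 9326, "g": 9564, "t": 41411, "others": 405}
RECORD_LENGTH = 104992


def segment_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def check_layout(segments, record_length):
    """Return True if the segments tile 1..record_length without gaps or overlaps."""
    expected_start = 1
    for _kind, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start - 1 == record_length


contig_lengths = [segment_length(s, e) for kind, s, e in SEGMENTS if kind == "contig"]
layout_ok = check_layout(SEGMENTS, RECORD_LENGTH)
base_total = sum(BASE_COUNT.values())

print(contig_lengths, layout_ok, base_total)
```

Under these assumptions the contig lengths agree with the lengths stated in the note, and the base counts sum to the record length.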



Query Match 3.4%; Score 189.6; DB 41; Length 104992;

Best Local Similarity 44.9%; Pred. No. 1.5e-13;

Matches 979; Conservative 0; Mismatches 1179; Indels 22; Gaps 6;

Qy 1802 aaataattattaattaaaatttatggacttttggactgtctgactaattttcagaatttt 1861
v ,111111 I lllll II I II I III I llll I I llll' 



Db 75037 AAATAAATCTTAATAAATAATTTTTTTTGATAGATTTTCTAGGATAATATGAAATATTTC 74978 

Qy 1862 attttggttttgggttttgttgaattttttagataattattttaaatattctgcataatt 192: 

llll llll III llll llll II I llll 
Db 74977 GTAAAAAAAATAATAATAAAAATGATATTTAAATATTTATATTAACTA--CAATATTAAT 749: 

Qy 1922 tttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaagaattt
II I lllll III II I I II III I II II III 
Db 74919 ATTTTATTATTTATAAATTATTTATTTAAAAATATATAAATATTAATTTATGGAATTTTT 7481 



Qy 1982 ttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataagttagta 2041 

I II I lllll I I I I II II I II I II I II I 
Db 74859 AAATTAATTT AAATT AATTGTTTAT - ATTT AATTATATATTT AATAAAAT AT ATTTTAAA 74801 



!042 ttacgatttttagtttgatttggtggaaagtaatgtatgtttttgaacataattatttga 210: 
II I I lllll II II II III I I II II I 
Db 74800 ATAATACACACAAAATGATTCTTAATTAAATATAAAATATTTATTTTATTTATAATGAAA 747- 

Qy 2102 caataattaagttttctagggaataaacggaaatatcttcttcttttttgtaaaattact 216: 

ii ii iii i ii n iii n i i inn i i ii i 

Db 74740 TAAATAATTTAATTTAAAATAAAATAAAATAAAAATAATAATATTTATTATTATATAATA 74681 

Qy 2162 aatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcagtgtaa 222: 

II II I I II I II I III I I I I I II 
Db 74680 TATTTAATTAAATTAAATTTATTATTTAATTAATTTAAAATAAAATAAAAATAATAATAT 746: 

Qy 2222 ctctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtagtaagtctat 228: 

I I I II I I II I I III I I I III 

Db 74620 TTATTATTATATAATATATTTAATTAAATTAAATTTATTATTTAATTAATTAATTTAAAA 74561 



Qy 2282 agaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttttcctttttc 2341 

III I I I II I II I II II II II 

Db 74560 TAAAATAAAAATAATAATATTTATTATTATATAATATATTTAATTAAATTAAATTTATTA 74501 



Qy 2342 ttcaattaacatatggttgattcaagttccgatctataataatttattacgatttatcaa 240: 

II llllll II I II I II I III I II I III 
Db 74500 TTTAATTAATTTAAAATAAATAATAATTAAATTAATATATATTATAATTAATTAAAAATA 744' 

Qy 2402 tttcaattaccttatatcatcctattataaatataagtcagttcaattcagttttcgaaa 24j: 
III I III I lllll III I I I II I I I I II r 
Db 74440 AAATAATATTCAATTATAAATTTATTAATAATTTTAAATAAATAATTAAAATAATAATAA 7.4.381 

Qy 2462 gttcccaaaaattttgaattttattaaatttattccctaaaaccgaaatagttatatctt 252: 
II II I I III I I llll II I III III I I 

Db 74380 ATTAAATAATTTAATTAATATAAATAAACATTATAAATTAAAAAATAATTTAATAAATAT 743: 

Qy 2522 tcaaatttaagtttcatttttcaatccgatttcaatttcatccttttataactctctatt 258: 

I II II II II II I III II I I I I llll 
Db 74320 ATATATAATTATTAATAAATTTAAATACATATATTTTTAATAATAATTTCTTAATTTATT 742i 

Qy 2582 atctataattacataaatttcaaattaattttgaaatatttacactttagtccctaagtt 264: 

I I llll III I I II I II III I III II I I I I II 

Db 74260 TTATTACATTATTATAATATATATTTTTTATTTAAAAAAATATAATTAMTAAATTAATT 74201 

Qy 2642 caaaactataaattttcactttagaaattaatcatttttcacatctaagcatcaaattta 270: 

II I I II I I I II I III I I llll II II 
,200 TAATAATTAAATATATTTAATAGTAATTAAATATTAAACAAATAATATAAATATTATATA 741- 

1702 accaaatgacacaaatttcatgattagttagatcaagcttttgagtcttcaaaacataaa 276: 

I II I lllll I I II II I I II I II llll 
,140 ATATTATTAATTAAATTAAAATAATAATTTATITTATAATTATATATATATATTAATAAT 74081 

1762 aattacaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaagcttggccg 282: 
I I llll I III II I I lllll III II II I 
TAATTTAAAATATAAATTAAGAAAGATAATTTTATACTTTTATTTAATTAAATATATAGT 7401 

1822 aatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacggtggagagaa 288: 

III I I II II I I I II Nil I II II 

,020 AATAAATAATTTTATGTTATTTATTATAATAATATTTATTATTTTATTTTATTTATTTAA 73951 



1882 gagggaaatgaagattgaccatatttttttattatgttttaacat----ataatattaat 2937 

I llll II llll III I III II II II III llllll 
i960 TAAATAATTAATTTTATAAAATATATTTATTTTAAATTAAAATATAAACATATAATTAAT 73901 
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JOURNAL Science 258 (5086), 1345-1348 (1992) 

MEDLINE 93088057
REFERENCE 8 (bases 14917 to 19517)
AUTHORS Lewis,D.L., Farr,C.L., Farquhar,A.L. and Kaguni,L.S.
TITLE Sequence, organization, and evolution of the A+T region of
Drosophila melanogaster mitochondrial DNA
JOURNAL Mol. Biol. Evol. 11 (3), 523-538 (1994)
MEDLINE 94285822
REFERENCE 9 (bases 1 to 408; 13319 to 19517)
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S.
TITLE Drosophila melanogaster mitochondrial DNA: completion of the
nucleotide sequence and evolutionary comparisons
JOURNAL Insect Mol. Biol. 4 (4), 263-278 (1995)
MEDLINE 96423163
REFERENCE 10 (bases 1 to 19517)
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S.
TITLE Direct Submission
JOURNAL Submitted (03-OCT-1995) Laurie S. Kaguni, Biochemistry Department,
Michigan State University, East Lansing, MI 48824-1319, USA
FEATURES Location/Qualifiers
source 1..19517
/organism="Drosophila melanogaster"
/organelle="mitochondrion"
/db_xref="taxon:7227"
/note="derived from new and previously submitted
sequences; sequence is a composite containing sequences
obtained from different Drosophila melanogaster strains"
tRNA 1..65
/gene="mt:ND6"
/product="tRNA-Ile"
/db_xref="FlyBase:FBgn0013685"
gene 1..19517
/gene="mt:ND6"
/note="mitochondrial NADH-ubiquinone oxidoreductase chain
6"
/allele=""
/db_xref="FlyBase:FBgn0013685"
tRNA complement(97..165)
/product="tRNA-Gln"
tRNA 171..239
/gene="mt:ND6"
/product="tRNA-Phe"
/db_xref="FlyBase:FBgn0013685"
CDS 240..1265
/gene="mt:ND6"
/codon_start=1
/db_xref="FlyBase:FBgn0013685"
/transl_table=5
/product="NADH dehydrogenase subunit 2"
/protein_id="AAC47811.1"
/db_xref="GI:1166530"
/translation="MFNNSSKILFITIMIIGTLITVTSNSWLGAWMGLEINLLSFIPL
LSDNNNLMSTEASLKYFLTQVLASTVLLFSSILLMLKNNMNNEINESFTSMIIMSALL 
LRSGAAPFHFWFPNMMEGLTWMNALMLMTWQRIAPLMLISyLNIRYLLLISVILSVII 
GAIGGLNQTSLRKLMAFSSINHLGWMLSSLMISESIWLILFFFYSFLSFVLTFMFNIF 
KLFHLNQLFSWFVNSKILKFTLFMNFLSWGLPPFIjGFLPKWLVIQQLTLCNQYFMLT 
IMMMSTLITLFFYLRICYSAFMMNYFENNWIMKMNMNSINYNMYMIMTFFSIFGLFLI 
SLFYFMF" 

tRNA 1264..1329
/gene="mt:ND6"
/product="tRNA-Trp"
/db_xref="FlyBase:FBgn0013685"
tRNA complement(1322..1383)
/product="tRNA-Cys"
tRNA complement(1403..1468)
/product="tRNA-Tyr"
CDS join(1470..1472,1474..3009)
/codon_start=1
/exception="mechanism underlying reading frame shift after
first codon uncertain"
/transl_table=5
/product="cytochrome c oxidase subunit I"
/protein_id="AAC47812.2"
/db_xref="GI:7412849"
/translation="MSRQWLFSTNHKDIGTLYFIFGAWAGMVGTSLSILIRAELGHPG
ALIGDDQIYNVIVTAHAFIMIFFMVMPIMIGGFGNWLVPLMLGAPDMAFPRMNNMSFW 
LLPPALSLLLVSSMVENGAGTGWTVYPPLSAGIAHGGASVDLAIFSLHLAGISSILGA 
VNFITTVINMRSTGISLDRMPLFVWSWITALLLLLSLPVLAGAITMLLTDRNLNTSF 
FDPAGGGDPILYQHLFWFFGHPEVYILILPGFGMISHIISQESGKKETFGSLGMIYAM 
LAIGLLGFIVWAHHMFTVGMDVDTRAYFTSATMIIAVPTGIKIFSWLATLHGTQLSYS 
PAILWALGFVFLFTVGGLTGWLANSSVDIILHDTYYWAHFHYVLSMGAVFAIMAGF 
IHWYPLFTGLTLNNKWLKSHFIIMFIGVNLTFFPQHFLGLAGMPRRYSDYPDAYTTWN 
IVSTIGSTISLLGILFFFFIIWESLVSQRQVIYPIQLNSSIEWYQNTPPAEHSYSELP 
LLTN" 

tRNA 3012..3077
/gene="mt:ND6"
/product="tRNA-Leu"
/db_xref="FlyBase:FBgn0013685"
CDS 3083..3767
/note="TAA stop codon is completed by the addition of 3' A
residues to the mRNA"
/codon_start=1
/transl_except=(pos:3767,aa:TERM)
/transl_table=5
/product="cytochrome c oxidase subunit II"
/protein_id="AAC47813.1"
/db_xref="GI:1166532"
/translation="MSTWANLGLQDSASPLMEQLIFFHDHALLILVMITVLVGYLMFM

LFFNNYVNRFLLHGOLIEMIWTILPAIILLFIALPSLRLLYLLDEINEPSVTLKSIGH 

QWYWSYEYSDFNNIEFDSYMIPTNELMTDGFRLLDVDNRWLPMNSOIRILVTAADVI 

HSWTVPALGVKVDGTPGRLNQTNFFINRPGLFYGQCSEICGANHSFMPIVIESVPVNY 

FIKWISSNNS" 
tRNA 3768..3838
/gene="mt:ND6"
/product="tRNA-Lys"
/db_xref="FlyBase:FBgn0013685"
tRNA 3840..3906
/gene="mt:ND6"
/product="tRNA-Asp"
/db_xref="FlyBase:FBgn0013685"
CDS 3907..4068
/gene="mt:ND6"
/codon_start=1
/db_xref="FlyBase:FBgn0013685"
/transl_table=5
/product="ATPase 8"
/protein_id="AAC47814.1"
/db_xref="GI:1166533"
/translation="MPQMAPISWLLLFIIFSITFILFCSINYYSYMPNSPKSNELKNI
NLNSMNWKW" 
CDS 4062..4736
/gene="mt:ND6"
/codon_start=1
/db_xref="FlyBase:FBgn0013685"
/transl_table=5
/product="ATPase 6"
/protein_id="AAC47815.1"
/db_xref="GI:1166534"
/translation="MMTNLFSVFDPLAIFNFSLNWLSTFLGLLMIPSIYWLMPSRYNI
MWNSILLTLHKEFKTLLGPSGHNGSTFIFISLFSLILFNNFMGLFPYIFTSTSHLTLT 
LSLALPLWLCFMLYGWINHTQHMFAHLVPQGTPAILMPFMVCIETISNIIRPGTLAVR 
LTANMIAGHLLLTLLGNTGSSMSYMLMTFLLMAQIALLVLESAVAMIQSYVFAVLSTL 
YSSEVN" 
CDS 4736..5524
/gene="mt:ND6"
/codon_start=1
/db_xref="FlyBase:FBgn0013685"
/transl_table=5
/product="cytochrome c oxidase subunit III"
/protein_id="AAC47816.1"
/db_xref="GI:1166535"
/translation="MSTHSNHPFHLVDYSPWPLTGAIGAMTTVSGMVKWFHQYDISLF
VLGNIITILTVYQWWRDVSREGTYQGLHTYAVTIGLRWGMILFILSEVLFFVSFFWAF 
FHSSLSPAIELGASWPPMGIISFNPFQIPLLNTAILLASGVTVTWAHHSLMENNHSQT 
TQGLFFTVLLGIYFTILQAYEYIEAPFTIADSIYGSTFFMATGFHGIHVLIGTTFLLV 
CLLRHLNNHFSKNHHFGFEAAAWYWHFVDWWLFLYITIYWWGG" 
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48824-1318, USA 
FEATURES Location/Qualifiers 
source 1..4601
/organism="Drosophila melanogaster"
/organelle="mitochondrion"
/strain="Oregon-R"
/db_xref="taxon:7227"
/dev_stage="embryo"
gene 1..4601
/gene="mt:ori"
/note="mitochondrial origin"
/allele=""
/db_xref="FlyBase:FBgn0013687"
repeat_unit 650..1022
/gene="mt:ori"
/note="repeat I-A"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 1023..1360
/gene="mt:ori"
/note="repeat I-B1"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 1361..1705
/gene="mt:ori"
/note="repeat I-C/A"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 1706..2043
/gene="mt:ori"
/note="repeat I-B2"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 2044..2388
/gene="mt:ori"
/note="repeat I-C"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
misc_feature 2491..2511
/gene="mt:ori"
/note="deoxythymidylate stretch"
/db_xref="FlyBase:FBgn0013687"
repeat_unit 2512..2648
/partial
/gene="mt:ori"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 2649..3112
/gene="mt:ori"
/note="repeat II-A"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 3113..3576
/gene="mt:ori"
/note="repeat II-B1"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 3577..4040
/gene="mt:ori"
/note="repeat II-B2"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 4041..4504
/gene="mt:ori"
/note="repeat II-C"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
misc_feature complement(4565..4585)
/note="deoxythymidylate stretch"
BASE COUNT 2271 a 131 c 74 g 2125 t
ORIGIN
ORIGIN 



Query Match 3.4%; Score 189.6; DB 33; Length 4601;
Best Local Similarity 44.3%; Pred. No. 3.7e-13;
Matches 1063; Conservative 0; Mismatches 1309; Indels 28; Gaps 6;



Qy 1538 ttttcgaaaaaaatttgcattgtgtttttctgaaaaatattgcattaacataatcatgca 1597 

III! Illllll I II I II I Mil III Mill I I 
Db 2203 TTTTTTAAAAAAAAATTATTTATTAAATTATACTTAATAAACTATTTTIATAATAAATTA 2262 

Qy 1598 ttctcaattttggtcaattgaacgttataaaattctctatgatatcctgatctgtttatt 1657 

II I I I III II I II II I III! II I I III 

Db 2263 TTTTATAAATAAAATTATTTAAAATAAITAATAAAAATIAIATATATATATATAIATATA 2322 

Qy 1658 acattatatg-tgtttatgcttgagttaagtcaaacattgagattcatagctcacccaat 1716 

I I III I II II I II II I I I II III I I 
Db 2323 TATTAAAATGAAAATAATTTTTAAATTTTAATAATAAATAAATTTAATAATTAATAATTA 2382 

Qy 1717 tatttaatcatttcaggcaatctgcagacttaggattggatggcgttcaggagcttggat 1776 

II llll llll I I III I I II II I I II 

Db 2383 AATAAAATCTATTCATTATTAATATTTAATTAATAATAAATAAATTTAAIAACTAATAAT 2442 

Qy 1777 tggttttctcacatcatattttattaaataattattaattaaaatttatggacttttgga 1836 

II II I II II llllll llll Mil llll llll 
Db 2443 TAAATAAAATTTATTIATTACTAATATTTAATTAATAATAAAAAATTATITTTTITTTTT 2502 

Qy 1837 ctgtctg-actaattttcagaattttattttggttttgggttttgttgaattttttaga 1894 

III I llllll llll I I I II I III I II 

Db 2503 TTTTTTTTTAATAATTTAATTAATTATTATATATITATAAATTTATATATTATTGAATAT 2562 

Qy 1895 taattattttaaatattctgcataatttttctgttatttgaaaaggatgttcgaattttt 1954 

I II II I I Illllll llllll III llll llll 

Db 2563 TTATAATATATATATATATATAGAAAAATTAAATTATTTAAATAATTTAATATAAATTTT 2622 

Qy 1955 tttcaaaattgaaacgtttaagaatttttactactgcaaattcagaataagtgaatttgt 2014 

II III II Mill llll I III lllll II I I 

Db 2623 TTAAAAATTTCTrAAATGTAITATTTTTATAAAAAATAITTATAIAAIAAAATCATGTTT 2682 

Qy 2015 tttttagaaagattaaataa-gttagtattacgatttttagtttgatttggtggaaagt 2072 

III I II I III II III II lllll III II" 
Db 2683 TTTAAAAAATAAACAAAAAATTTTTAATAAATAAATTTTATAAIGAAATATAATTTATTT 2742 

Qy 2073 aatgtatgtttttgaacataattatttgacaataattaagttttctagggaataaacgga 2132 

I I I lllll llllll III I II II III llll I II III 
Db 2743 ATTTTTCATTTTTAAAAAAAAATTTTTTAAAAAAAATAATTTTITITITAAAAAAAAACT 2802 

Qy 2133 aatatcttcttcttttttgtaaaattactaatgcaagaacaaacaacgttttggggagca 2192 

I II II I II I II II llll II 
Db 2803 ATATACTAATTATAAATTAAIAGATATITATATATATATAAATATTTAATATATTATTAT 2862 

Qy 2193 aataatctagctttaagtagtcagtgtaactctcaaaatctggtcataacttctaggctg 2252 

I I lllll II llll I III II III I II 
Db 2863 ATATCTAATAATTTAAATAAAAAATTTTAAAATTIAAAAATGTAGATATAATTTATAAAA 2922 

Qy 2253 agtttgctgtgctacagtagtaagtctatagaaacttacctgacaaaacgacatgacgtc 2312 

I II I I II I I I I I I II llll II I I 
Db 2923 ATTTATATTCTCATATTTATTTATTATTAATTTAATTTATATAAAIAATATAATAATTTA 2982 

Qy 2313 agggtcgaatctacaacttttcctttttcttcaattaacatatggttgattcaagttccg 2372 

I I I II I I I III I I III I llllll 
Db 2983 ATTAATTATTATATATTTATAAATTTATATATTATTGAATATTTATATAATATATATATA 3042 

Qy 2373 atctataataatttattacgatttatcaatttcaattaccttatatcatcctattataaa 2432 

I II III II I lllll I II II I I I II llll 
Db 3043 TATATAGAAAAATTAAATTATTTAAATAATTTAATATAAATTTITIAAAAATTTCTTAAA 3102 

Qy 2433 tataagtcagttcaattcagttttcgaaagttcccaaaaattttgaattttattaaattt 2492 

I II II I I I II I I I I III I llll 
Db 3103 TGTATTATTTTTATAAAAAATATTTATATAATAAAATCATGTTTTTTAAAAAATAAACAA 3162 

Qy 2493 attccctaaaaccgaaatagttatatctttcaaatttaagtttcatttttcaatccgatt 2552 

i i n i i i ii i i n ii minim n 

Db 3163 AAAATTTTTAATAAATAAATTTTATAATGAAATATAATTTATTTATTITTCAATTTTTTT 3222 
Qy 2553 tcaatttcatccttttataactctctattatctataattacataaatttcaaattaattt 2612 
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/protein_id- n CAB3 8 9 8 6 . 1 " 
/db_xref="GI:4493950"
/db_xref="SPTREMBL:O97275"

/translation="MLPyKLICLFLSFIIIIIHGKDISSTKKLLKSGNLNNKIKKGRE
RYGGIYNSKyMTIEKRRKYMYIKKQKNNNNSYFSYCNVYKNNDVNNNYTAYNYYINNP 
KNKLKEYYEKIKNHVIKKKKKIFSLKPSONKRNEKKKKYFFINFTSFHDIKDNIKVLF 
NIDYIENNIIYSYIKSFKRTPPITKIYLLGTFLLSVLIHMNKNVYKLILFDFNKIFKK 
GEIWRLFTPYLYIGKLYLQYILMFNYLNIYMSSVEISHYKKPEDFLIFLTFGYISNLL 
FTIWANMYNENIMNVKLYIHNFKNFFIKDCVSKYTSRSSTNNNSNNINSNNRSSNNNN 
HYUNSKNI DIKKEQYNHLGYVFST YI LYYWSRINEGTL I NCFELFFI KAEYVPFFF I I 
QNILLYNEFSLYEVASIFSSYLFFTYEKYFKFNYLRLFFKTLLKVIHIYPLYDRFQMN 
TKLLISILKKRKNGPLPFLQKCYIHNVPRINTMKHMNISDLRKNIEVCNKKNVKHKNV 
WSNFLYIIILKLFNKEIKNYVDFMIILKLLSKYIKIEKKVLLYICEQIEHEIYKFRTR 
DLTLLILILRKNNFDNIYYINLISKSILMKMNKNMSYKDLALIIYSLSKNIYLTDEQI 
YNKEIFNFSILKFENHLNNVNINLHSLSLFFYSYSVYFINNCFYYYYYFHSFFNIITK 
FINIINKNLHLYNSTDLMFLYIGSLHIHNMYTPNHVDQNKEPKNNQKENNNYHNDNHN 
IYLKNINNNCYDHRLDSNDFITMTNYDQGEYNKHIQQNKHIQQNKHIQQNKHIQQNKH 
IQRIGTHCTESNSNNQQLIQIQNDERENRLITYDNSRHNLLKDPCQHNIVERDGERRQ 
NLI KNLIINIKKI IEEKLSSFKIQEIVNILFVSLNKNI I INKKYFHFLNQEKINIRNY 
I NI Y7NINK I YLNDEEENTSHCILK I KNDNKKDI LYHDHMKFLYNLMNE I I YRNDLLN 
MKQIILLLYGLKFNNFMFLOFEKIILKRFICLPKKEIQKIGKEEIMFLYQYFFVRTCL 
FNELKKQNNLFISQDEYENYIYISDKYNESAKLDNSYNMPSNLREKNTNHHGGKDNTL 
DLYIHDDIiFYMNKNKKRDRYKIYLYDNFIFNYPAYYVEQKKDHIDYNESVNNFDNMKS 
FIQLKKKKINKINNNNNNNNNMNNIYIDTNIQTVNKNYSCTHNOTIKNETNDNYPNS 
TIRNQHPNDQVILNNPVFFYNKKLNWDSIDFEYELTCYNLYLDIYKIVCLKLLTLLK 
NHKLSCLQSIDILCIYEKLNIRDYRIIKYLYNLKKELLYLDNTYLLKVINIIVKFNLY 
NMISYLQINKILTFINYNNINESIQILKLIGMLISVHKHNKLSPFHMNNLNVQNAANY 
LFKNLYNLQNIQDLKKIEMMNVYDNLTFKFYRLFKNILSINVKRYVQNCNSYNKYEMN 
THTNNLNKNEQHKYIHHNNDHKDGRHNNNNNHYDKVDVSSSSSSSYYYYLNKSGKNLG 
NINVQNLDDININKIKSISYKIKKDQIKDIGYMRVSKYSELMKSMKMMNYDEHFNDEY 
RNVCDEIYEDLFLIYNKNIOVYKNINICNYTFPMAINLLTLNNDENILININKSDDNK 
RLIRVDRRRFLIVDILYNYDYYYTLTRSRLDRLREYNIYLSYYSNHIRRRNKRILNYR 
KYALLKLIKKRGFNYICIDADTYVKNKKGKSKDLSYEINKLYINNLILDILKRQKKNH 
LHPHPHTQNRTTKQIKNINIKELLLYHQNKRNVKKIIHFKNYKYKIMNLPDQRNHYH 
NKRIKYIKDRSLLAINHKTRNUEKQKISTSNHLSKLKRMFSL" 

gene complement(20528..21454)
/gene="MAL3P5.5"
CDS complement(20528..21454)
/gene="MAL3P5.5"
/note="predicted using hexExon; MAL3P5.5 (PFC0595C),
Serine/threonine protein phosphatase (PP2), len: 309 aa;
Similarity to serine/threonine protein phophatases.
M.domestica serine/threonine protein phosphatase
(TR:Q42912) BLAST Score: 1005, sum P(1) = 6.9e-107; 60%
identity in 301 aa overlap."
/codon_start=1
/protein_id="CAB38970.1"
/db_xref="GI:4493934"
/db_xref="SPTREMBL:O97259"
/translation="MARGEERRWIEQLRMNPPRLLDESDLRLVCQRVREILVEENNVQ
SIKPPVIICGDIHGQFFDLLELFDVGGDIMNNDYIFLGDYVDRGYNSVETFEYLLLLK 
LLFPKNITLLRGNHESRQITTVYGFYDECFKKYGNANAWKYCTDIFDYLTLAALVDNQ 
IFCVHGGLSPEIKLIDQLRLINRVQEIPHEGAFGDIMWSDPDEVDDWVANPRGAGWLF 
GPNVTKKFNHINNLELIARAHQLAMEGYRYMFEDSTIITWSAPNYCYRCGNVAAIMR 
IDEYMNRQMLIFRDTPDSRNSIRNKATIPYFL" 

gene 25252..26157
/gene="MAL3P5.6"
CDS join(25252..25296,25453..26157)
/gene="MAL3P5.6"
/note="predicted using hexExon; MAL3P5.6 (PFC0600w),
Hypothetical protein, len: 250 aa"
/codon_start=1
/protein_id="CAB38972.1"
/db_xref="GI:4493936"
/db_xref="SPTREMBL:O97261"
/translation="MRRYLNRYMYIYNIYNRLEEKYRNFLRLRNMNSHMGASQNMNVN
NNYTMNELEEFERINNNYNNNNNNINNNINNYYDYMNIRVSQSVQHNRRLQDFYNNKN 
SFQHYIKKLKTCRFDADDIRNLLEKRLAYERDNTLIKNIQEEENKKGIGINGNFGSES 
NSSSSNYDNNYLLYRKINRLNRTNTNRSRNRSRRRRRINSRIDRRYIIRCRACKFINP 
NGFRIEDYYTCQNCGYNDFSVIRSTSPNNAD" 

gene 27547..28290
/gene="MAL3P5.7"
CDS 27547..28290
/gene="MAL3P5.7"
/note="predicted using hexExon; MAL3P5.7 (PFC0605c),
Hypothetical protein, len: 248 aa"
/codon_start=1
/protein_id="CAB41709.1"
/db_xref="GI:4725991"
/db_xref="SPTREMBL:Q9Y011"
/translation="MGGHGGLNILPQKKWNVYRRDAQYRVHYDEHRIIKEERDREIKR

KKDEFESTISTLKKNMTKNEDSDNNYNNFYDENGEKKTTTNYCNDHINLFIDEEKELT 

AKQKKHEEFLIKKGHYIYYDKNFNTQHNSIYDKNKNAQIISDFNKMKLCERDWFLNKK 

NKNERTRDNGANFFHIQRDNISEEHNKTENINSDLSLYCNTNNYITHDKRKERKQMHY 

HIKRIIRYRQEKDRERRRRRQGREKRKPR" 
gene complement(29992..33537)
/gene="MAL3P5.8"
CDS complement(29992..33537)
/gene="MAL3P5.8"
/note="predicted using hexExon; MAL3P5.8 (PFC0610c),
Hypothetical protein, len: 1182 aa"
/codon_start=1
/protein_id="CAB38971.1"
/db_xref="GI:4493935"
/db_xref="SPTREMBL:O97260"
/translation="MAHKVKREKRTEAQETPWAKEOTHAKEENNESNIAVTEENVIS
KNGQEIAISRNDQEIAISRNDQEIAISNNDQEIAISKNDQENVALNSSEERQNASKEE 
DNELRQIREFHDISNENEHNENRSFSTSTLSSFFREYEENSVEQHFFSEGTHTEHSME 
DSNNVETIENAITNDVLRSNRSTSYSKQRNELTSVTCYVCGETVDLNIWSDHIFAHRL 

Query Match 3.5%; Score 192.6; DB 33; Length 86829; 

Best Local Similarity 43.8%; Pred. No. 7.3e-14; 

Matches 1064; Conservative 0; Mismatches 1354; Indels 9; Gaps 5; 
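The listing does not say how "Best Local Similarity" is derived. One reading that reproduces the figure reported here is the number of matches divided by the total number of aligned columns (matches, conservative substitutions, mismatches and indels). The Python sketch below illustrates that assumed formula. The function name `best_local_similarity` is hypothetical and is not taken from the search program.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of aligned columns that are exact matches.

    Assumed formula: matches / (matches + conservative + mismatches + indels).
    The search output itself does not document the calculation.
    """
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Values from the summary line directly above:
# Matches 1064; Conservative 0; Mismatches 1354; Indels 9
similarity = best_local_similarity(1064, 0, 1354, 9)
print(round(similarity, 1))
```

With the counts above, the result rounds to the 43.8% shown in the summary, so the assumption is at least consistent with this record.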

Qy 1558 tgtgtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtcaattg 1617

II II I III II! Mil I III II I I I II II 

Db 39027 TATTTTAAAATAAAATATAAATATTTAATAAAATAATAAAAAAAAATATATGTAATAGTT 39086 

Qy 1618 aacgttataaaattctctatgatatcctgatctgtttattacattatatgtgtttatgct 1677 

I Mill III I II Mill I I 1 

Db 39087 ATATATATAATATTAAATTAATATAAATTAATATAATATAATAAATAAATAATAATATAT 39146

Qy 1678 tgagttaagtcaaacattgagattcatagctcacccaattatttaatcatttcaggcaat 1737

i i ii 1 1 1 1 i ill i i mi i i i in i r 

Db 39147 ATATTAAATAAATAAAATAAACAAAATAAATTAAATTATTTTAAATTAATTAAATAAATA 39206

Qy 1738 ctgcagacttaggattggatggcgttcaggagcttggattggttttctcacatcatattt 1797 
III III I I I I I III II I I III 

Db 39207 AAATATATTATTTATTAAAATAAATAAATTAATATATATTATTTATTAAAATAAAAATAA 39266 

Qy 1798 tattaaataattattaattaaaatttatggacttttggactgtctgactaattttcagaa 1857 

I I III MM MIMIII I I II I I II I II 

Db 39267 ATTAATATATATATTTATTAAAATAAAAATAAATTAATATATATTATTTATTAAAATAAA 39326 

Qy 1858 ttttattttggttttgggttttgttgaattttttagataattattttaaatattctgcat 1917 

I I II II III II II I I I I I II I I MM I 

Db 39327 AATAAATTAATATATATTATTTATTAAAATAAAAATAAATTAATATATATTATTTATTAA 39386 

Qy 1918 aatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaaga 1977 

III I I I I II I I Mill I I I I I 

Db 39387 AATAAAAATAAATTAATATATATTATTTATTAAAATAAAAATAAATTAATATATATTATT 39446 

Qy 1978 atttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataagtt 2037 

II I II II II III I I II I II III I I I II 

Db 39447 TATTAAAATAAAAATAAATTAATATATATTATTTATTAAAATAAAAATAAATTAATAATT 39506 

Qy 2038 agtattacgatttttagtttgatttggtggaaagtaatgtatgtttttgaacataattat 2097 

I II I I II I MM II I II I I II I II 
Db 39507 AAAATAAATAAATTAATATATATTATTAATTAATATAAATAATAAATAAATAATTAAAAT 39566 

Qy 2098 ttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgtaaaat 2157 

III I II I II I I II MM I I I I III 
Db 39567 AAATAAATTAATATATATTATTAATTAAAATAAATAATAAATAAATTAATAATTCAAATA 39626 

Qy 2158 tactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcagt 2217 

I II I II I I I II I II I I II I I II I I 
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FEATURES 
source 



BASE COUNT
ORIGIN 



* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 67262: contig of 67262 bp in length
* 67263 67462: gap of unknown length
* 67463 82485: contig of 15023 bp in length
* 82486 82685: gap of unknown length
* 82686 130281: contig of 47596 bp in length.
Location/Qualifiers
1..130281
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="12"
/clone="3D7"

52250 a 11780 c 11855 g 53996 t 400 others



Query Match 3.6%; Score 197; DB 60; Length 130281;
Best Local Similarity 45.7%; Pred. No. 2.1e-14;
Matches 1118; Conservative 0; Mismatches 1290; Indels 37; Gaps 11;



Qy 1556 attgtgtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtcaat 1615 

III I III I I I II I I II II I III II III I I I I 
Db 101320 ATTTTATTTATTTCATTAAAAAAGGATAAAGACAATAATAAATTATAATAATAAAAAACA 101261 

Qy 1616 tgaacgttataaaattctctatgatatcctgatctgtttattacattatatgtgtttatg 1675 

I MM I I II II III II II II I I I II 

Db 101260 TTCCAAATATATACCCCCAAATATATATTATATATGTATAAGACTATACAAATATATAAA 101201 

Qy 1676 cttgagttaagtcaaacattgagattcatagctcacccaattatttaatcatttcaggca 1735 

II I I I II III I I I II I 

Db 101200 TATMTMTCACATATTAATATAATATATATTTATTMTTTATAAAATAAATAAAAATAT 101141 

Qy 1736 atctgcagacttaggattggatggcgttcaggagcttggattggttttctcacatcatat 1795 

II I I I III II II I II I I I II I 

Db 101140 AIATATATAATTATATAATATATCAAATTAAATCATTATAAAATTTATTTAAAATATATT 101081 

Qy 1796 tttattaaataattattaattaaaatttatggacttttggactgtctgactaattttcag 1855 

llllll! Ill Mill I II Ml I I I I I 
Db 101080 AAAATTAAATATATATATATTAATAAATAATTAAGTTAATTATTTAATAAATAAAAATAA 101021 

Qy 1856 aattttattttggttttgggttttgttgaattttttagataattattttaaatattctgc 1915 

I I MM III I I I I I MM II I I I I II 

Db 101020 TAATAAATTTAAATATTAATATAATTAAATTCATAATACACATTAATTAATAAAATATGA 100961 

Qy 1916 ataatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaa 1975 

III I I M I I I II Ml Ml II III M I II 

Db 100960 ATATTAATATAAATAATAAATAGAAAAATATTAATACAATTTAAATATTAAATAAATAAA 100901 

Qy 1976 gaatttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataag 2035 

I II II I II I Mil II II III II II Mill 

Db 100900 AAIATTATAAT1TATAAATAATAAAATATTAATATAAATTAA1TAAIAATATATAATAAA 100841 

Qy 2036 ttagtattacgattttt---agtttgatttggtggaaagtaatgtatgtttttgaacata 2092 

III III I I Ml II III III MM I II III 
Db 100840 TTAATATAATTAAATTTAAAATATAAATTAATAAAAAAATAATACTAATATTAATATAM 100781 

Qy 2093 attatttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgt 2152 

II I I I HIM III MM III II MM I II I I I 

Db 100780 ATAAAATAATAATAAATAAATTTTAATTAAAATTAAAT - - AATAAAATATTAATATAAAT 100723 

Qy 2153 aaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtag 2212 

II I I Mil I II III I I I I I I II I I 

Db 100722 TAATTAAATAATAATATAATAAATTAATTAAATAATAATATAATAAATTAATTAATTAAC 100663 

Qy 2213 tcagtgtaactctcaaaatctggtcataacttctaggc---tgagtttgctgtgctacag 2269

I III I IMI I I III I II I I I I I I II II 
Db 100662 AATATTTAAATAATTAAATATAATAATATATATTAAACAATTAATTATTATAAATTAAAG 100603 

Qy 2270 tagtaagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaac 2329



I II II I I II I I II I I II I II III 
Db 100602 AATTATTAATAATAATATATAAATTAATTAAATAAAATGAATTATTTAAATAAITAAAAC 100543 

Qy 2330 -ttttcctttttcttcaattaacatatggttgattcaagttccgatctataataattta 2387 

I I I I I HIM I I I I II I I I III III 

Db 100542 AATAATATATATAAATTAATTATATATTTAGTAAATAAAATAAAATTAATAATTAAATTA 100483 

Qy 2388 ttacgatttatcaatttcaattaccttatatcatcctattataaatataagtcagttcaa 2447 

II I II II II II I II MM I llllll llllll 

Db 100482 ATAATTTATAGTTATAAAAAATAAAAATAAATATATAATTAAATATATAAATAAATTAAA 100423 

Qy 2448 ttcagttttcgaaagttcccaaaaattttgaattttattaaatttattccctaaaaccga 2507 

I I I I I I I I I II I I II I llllll III I I 

Db 100422 TAAATATATAAATAAATTAAATATATATACAATTAAATTAAATATATATATAATTAAATI 100363 

Qy 2508 aatagttatatctttcaaatttaagtttcatttttcaatccgatttcaatttcatccttt 2567 

II I I HIM II I II I I I I Ml I 

Db 100362 AAATATATATATAATTAAATTAAATATATATATAATTAAATTAAATTAAIAAATAAAATA 100303 

Qy 2568 tataactctctattatctataattacataaatttcaaattaattttgaaatatttacact 2627 

II I I II II llllllll II II I III I II I II II I 
Db 100302 AATTAATAAATAAAATATATAATTAAATTAAATAAATATATAATTAAATTAATATATAAA 100243 

Qy 2628 ttagtccctaagttcaaaactataaattttcactttagaaattaatcatttttcacatct 2687 

Ml II I I I II IMI I I II llllllll II II I II 
Db 100242 TAAAT • - TAAATATATAT ATAATTAAATTATATT ATAT AAATTAAT AATAT ATAATAT AA 100185 

Qy 2688 aagcatcaaatttaaccaaatgacacaaatttcatgattagttagatcaagcttttg--- 2744 

I II III MM I I I I I II III I II 

Db 100184 TAAATAAAATAATAATTAAATAATAATATATAATATAATAAATAAACAATAATTAAATTA 100125 

Qy 2745 agtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaa 2799 

I I III III II Ml I II I II III II III II 
Db 100124 ATAATATATTATAAATAATATAATAATTAAATATAATATAATAATTAATIAAATIAATAA 100065 

Qy 2800 tttgaacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttgtttcttt 2859

I I II III III III Mil I llllll 

Db 100064 TATATTAAATAAAATAATAATAAATATTAAACAATTAAATAAAATATACATAATTAATAT 100005

Qy 2860 ttgttgcaaacggtggagagaagagggaaatgaagattgaccatatttttttattatgtt 2919

Mill II I I HIM I III M III \: : - 
Db 100004 TAATAAATAATTATTATATTAAAATAAITAAAAAAAITAATTAATTTAAGATAATATATA 99945

Qy 2920 ttaacatataatattaataatttaatcataattatactttggtgaatgtgacagtgggga 2979 

II Mil llllll III I II II III II I I 

Db 99944 ATATTATATATAATTAATTATTAATAAATGTTTTATATTTAATTTAATTACACAAI1AAA 99885 

Qy 2980 gatacgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagag 3039 

II III I IMI III I II I II II 
Db 99884 TTATTTTATTTTATATATTAATTAAGTTAATTAATAAATGGTTATTTTTATTTTAATTAA 99825 

Qy 3040 tgatcaaagtttgagctgccttcaatgagccaatttttgcccataatggataaaggcaat 3099 

II I I I II III II II III I II II II 
Db 99824 AATGTGAAATATATTATTTATTATATTAAAAAAAATATATATATATTTAATTAATTAATT 99765 

Qy 3100 ttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtggcctgg 3159 

I I I I I I I I III Ml II III III I 
Db 99764 ATATGTTTATATTTTAATTTAAAATAAAT ATATTTTATAAAATTAATTATTTATTA 99709 

Qy 3160 tcacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaatgtaata 3219 

I I I MM II III II I I III MM III 
Db 99708 AATAAATAAAATAAAATAATAAATATTATTATAAIAAATAACATAAAATIATTIATTACT 99649 

Qy 3220 ttatattt--taaaataaaattatgttatttagattcttaatattttggagcattccata 3277 

IMI I llllllll III Ml llllllll II II II 
Db 99648 ATATATTTAATTAAATAAAAGTATAAAATTATCTTTCTTAATTTATATTTTAAATTAATT 99589 

Qy 3278 ctataatttcgtaacataatattaaaatatagtaatataaagtgtaattaactttaaatt 3337 

I I I II I III III I III II II I 

Db 99588 ATTAATATATATATATAATTATAAAATAAATTATTATTTTAATTTAATTAATAATATTAT 99529 

Qy 3338 acaagcataatattaaattttgaatcaattaatttttatttctattattttaattaattt 3397 

. Ill I llllll I II III III III llllll I Ml I II 
LOCUS I40338 1283 bp DNA PAT 13-MAY- 
DEFINITION Sequence 17 from patent US 5620882. 
ACCESSION I40338 
VERSION I40338.1 GI:2082630 
KEYWORDS . 
SOURCE Unknown. 
ORGANISM Unknown. 
Unclassified. 
REFERENCE 1 (bases 1 to 1283) 
AUTHORS John,M. 
TITLE Genetically engineering cotton plants for altered fiber 
JOURNAL Patent: US 5620882-A 17 15-APR-1997; 
FEATURES Location/Qualifiers 
source 1..1283 
/organism="unknown" 
BASE COUNT 509 a 233 c 251 g 290 t 
ORIGIN 

Query Match 5.0%; Score 273.4; DB 5; Length 1283; 
Best Local Similarity 84.2%; Pred. No. 2.1e-22; 
Matches 326; Conservative 0; Mismatches 46; Indels 15; Gaps 1; 

Qy 4124 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4183 

I II I llllll 1 1 1 1 1 1 ! I M I 1 1 1 1 i I M 1 1 1 1 1 1 I II MM MUM 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72 

Qy 4184 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4243 

IMMIIIMMIIIIIIMMI I llllllllll I I MM! IMM II III 
Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4244 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4303 

MIMIIIIMI IMM IIIIIMI MIIMIMMIMM i 1 1 1 1 1 1 1 1 M 
Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 

Qy 4304 tcaaaatacgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaag 4363 

IMIMMMMMIMI II IIIIIMIIII Mill 

Db 193 TCAAAATACGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAA 237 

Qy 4364 tatcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaa 4423 

lllllllll Mill IIM llllllllll MUM llllllllllllllllll 
Db 238 TATCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAA 297 

Qy 4424 ccctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtac 4483 

IIIMIMM IMMIIIMMIMMIMM llllll IMMI! MIMIM 

Db 298 CCCTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTAC 357 

Qy 4484 gagaaagaaaatctcgacgggcccgaa 4510 

II IIIIIMI I III III II 

Db 358 GATAAAGAAAAACCCGATTTCCCCAAA 384 
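
The summary figures printed with each hit appear to be related by simple arithmetic: in the records of this report, Best Local Similarity equals Matches divided by the total number of aligned columns (Matches + Mismatches + Indels). This relation is inferred from the numbers shown here, not taken from the search program's documentation, and the function name below is hypothetical. A minimal Python sketch that can be used to sanity-check OCR-damaged percentages:

```python
def best_local_similarity(matches, mismatches, indels):
    """Percent of aligned columns that are identical, rounded to one decimal.

    Assumption: the report computes the figure over all aligned columns
    (matches + mismatches + indels); this is inferred from the data only.
    """
    columns = matches + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return round(100.0 * matches / columns, 1)


# Figures copied from two hits in this report.
patent_hit = best_local_similarity(326, 46, 15)
plasmodium_hit = best_local_similarity(1118, 1290, 37)
```

A percentage that disagrees with the recomputed value is a hint that one of the counts or the percentage itself was misread during scanning.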



RESULT 6 
AC005504 
LOCUS AC005504 104992 bp DNA HTG 01-APR-1999 
DEFINITION Plasmodium falciparum chromosome 12, *** SEQUENCING IN PROGRESS 
***, 3 unordered pieces. 
ACCESSION AC005504 
VERSION AC005504.3 GI:4558584 
KEYWORDS HTG; HTGS_PHASE1. 
SOURCE malaria parasite P. falciparum. 
ORGANISM Plasmodium falciparum 
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium. 
REFERENCE 1 (bases 1 to 104992) 
AUTHORS Hyman,R.W., Fung,E.L., Qin,F., Tamaki,T., Kurdi,O.B., Conway,A.B. 
and Davis,R.W. 
TITLE Plasmodium falciparum 3D7 chromosome 12 
JOURNAL Unpublished 
REFERENCE 2 (bases 1 to 104992) 
AUTHORS Hyman,R.W., Qin,F., Fung,E.L., Conway,A.B. and Davis,R.W. 
TITLE Direct Submission 
JOURNAL Submitted (21-AUG-1998) Stanford DNA Sequencing and Technology 
Center, Stanford University, 855 California Avenue, Palo Alto, CA 
94304, USA 
COMMENT On Apr 2, 1999 this sequence version replaced gi:4337172. 
* NOTE: This is a 'working draft' sequence. It currently 
* consists of 3 contigs. The true order of the pieces 
* is not known and their order in this sequence record is 
* arbitrary. Gaps between the contigs are represented as 
* runs of N, but the exact sizes of the gaps are unknown. 
* This record will be updated with the finished sequence 
* as soon as it is available and the accession number will 
* be preserved. 
* 1 58642: contig of 58642 bp in length 
* 58643 58842: gap of unknown length 
* 58843 91011: contig of 32169 bp in length 
* 91012 91211: gap of unknown length 
* 91212 104992: contig of 13781 bp in length. 
FEATURES Location/Qualifiers 
source 1..104992 
/organism="Plasmodium falciparum" 
/db_xref="taxon:5833" 
/chromosome="12" 
BASE COUNT 44286 a 9326 c 9564 g 41411 t 405 others 
ORIGIN 

Query Match 3.6%; Score 197; DB 41; Length 104992; 
Best Local Similarity 45.7%; Pred. No. 2.2e-14; 
Matches 1118; Conservative 0; Mismatches 1290; Indels 37; Gaps 11; 

Qy 1556 attgtgtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtcaat 1615 

Ml I III I I I III llllll III II III I I I I 
Db 72352 ATTTTATTTATTTCATTAAAAAAGGATAAAGACAATAATAAATTATAATAATAAAAAACA 72411 

Qy 1616 tgaacgttataaaattctctatgatatcctgatctgtttattacattatatgtgtttatg 1675 

I Mil I I II II III II II II I II II 

Db 72412 TTCCAAATATATACCCCCAAATATATATTATATATGTATAAGACTATACAAATATATAAA 72471 

Qy 1676 cttgagttaagtcaaacattgagattcatagctcacccaattatttaatcatttcaggca 1735 

II I I I MUM I I II I , 

Db 72472 TATAATAATCACATATTAATATAATATATATTTATTAATTTATAAAATAAATAAAAATAT 72531 

Qy 1736 atctgcagacttaggattggatggcgttcaggagcttggattggttttctcacatcatat 1795 

II I I I III II II I II II I II '''I 

Db 72532 ATATATATAATTATATAATATATCAAATTAAATCATTATAAAATTTATTTAAAATATATT 72591 

Qy 1796 tttattaaataattattaattaaaatttatggacttttggactgtctgactaattttcag 1855 

llllllll 'III Mill I II I II I I I I I 
Db 72592 AAAATTAAATATATATATATTAATAAATAATTAAGTTAATTATTTAATAAATAAAAATAA 72651 

Qy 1856 aattttattttggttttgggttttgttgaattttttagataattattttaaatattctgc 1915 

I I MM III I I I II MM llllllll 
Db 72652 TAATAAATTTAAATATTAATATAATTAAATTCATAATACACATTAATTAATAAAATATGA 72711 

Qy 1916 ataatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaa 1975 

IMM II I I I I I III Ml II III II I II 
Db 72712 ATATTAATATAAATAATAAATAGAAAAATATTAATACAATTTAAATATTAAATAAATAAA 72771 

Qy 1976 gaatttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataag 2035 

I II II I II I MM II II III II II Mill 

Db 72772 AATATTATAATTTATAAATAATAAAATATTAATATAAATTAATTAATAATATATAATAAA 72831 

Qy 2036 ttagtattacgattttt--agtttgatttggtggaaagtaatgtatgtttttgaacata 2092 

III III I I III II III III MM I II III 

Db 72832 TTAATATAATTAAATTTAAAATATAAATTAATAAAAAAATAATACTAATATTAATATAAA 72891 

Qy 2093 attatttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgt 2152 

III II IMM III MM Ml M MM I II I I I 

Db 72892 ATAAAATAATAATAAATAAATTTTAATTAAAATTAAAT--AATAAAATATTAATATAAAT 72949 

Qy 2153 aaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtag 2212 

II I I MM I II III I I lllllllll 

Db 72950 TAATTAAATAATAATATAATAAATTAATTAAATAATAATATAATAAATTAATTAATTAAC 73009 

Qy 2213 tcagtgtaactctcaaaatctggtcataacttctaggc--tgagtttgctgtgctacag 2269 
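
The AC005504 record states that the draft sequence consists of three contigs separated by runs of N standing in for gaps of unknown size. The following Python sketch, with a hypothetical helper name and a toy sequence rather than the real record, shows one way such a layout could be recovered from a sequence string, and re-derives the contig lengths from the coordinates quoted in the record's comment:

```python
import re


def contig_spans(sequence):
    """Return 1-based inclusive (start, end) spans of the stretches without N."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"[^Nn]+", sequence)]


# Toy sequence (not the real record): two contigs separated by a run of three N.
toy = "ACGTA" + "NNN" + "GGC"
spans = contig_spans(toy)

# Contig coordinates as quoted in the AC005504 comment block.
reported = [(1, 58642), (58843, 91011), (91212, 104992)]
lengths = [end - start + 1 for start, end in reported]
```

The recomputed lengths agree with the 58642, 32169 and 13781 bp stated in the record, which supports the reading of the OCR-damaged coordinate lines.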
    Result            Query
       No.    Score   Match   Length  DB  ID          Description
 c     25     149.8     2.7     2426   3  SDU49822    U49822 Saccharomyc
 c     26     149       2.7   242893  31  CEY53C12    Z92859 Caenorhabdi
       27     146       2.6    15421  33  PFCOMPIRA   X95275 P.falciparu
       28     143.2     2.6   106650  39  AC007708    AC007708 Homo sapi
 c     29     143       2.6   145670  50  AC008132    AC008132 Homo sapi
       30     142.8     2.6    98734  31  PFMAL1P2    AL031745 Plasmodiu
 c     31     142       2.6   176552  39  AC004617    AC004617 Homo sapi
 c     32     140.8     2.5    14433  34  AE001369    AE001369 Plasmodiu
       33     140.4     2.5   108908  33  PFMAL3P8    AL034560 Plasmodiu
 c     34     139.8     2.5   175516  60  AC006280    AC006280 Plasmodiu
 c     35     139.6     2.5   174427  31  PFMAL1P4    AL031747 Plasmodiu
       36     139.2     2.5   145670  50  AC008132    AC008132 Homo sapi
       37     139       2.5    14001  33  PFCOMPIRB   X95276 P.falciparu
       38     139       2.5    14211  34  AE001368    AE001368 Plasmodiu
       39     138.8     2.5     2426      SDU49822    U49822 Saccharomyc
       40     137.6     2.5   153098  33  PFMAL3P2    AL034558 Plasmodiu
       41     137.4     2.5   115218  10  HS159A1     AL034397 Human DNA
       42     137.4     2.5   176552  39  AC004617    AC004617 Homo sapi
 c     43     137.4     2.5   207957  11  AC004470    AC004470 Homo sapi
 c     44     136.8     2.5   153418  60  AC004153    AC004153 Plasmodiu
 c     45     136.4     2.5   146285  39  AC005083    AC005083 Homo sapi
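
Several entries in the summary carry a `c` flag, and some hit identifiers in this report carry a `/C` suffix (for example `AW349629/C`); in the corresponding alignments the Db coordinates run downward. This is read here as a hit on the complementary strand, which is an interpretation of the report's layout rather than something the report states. Under that assumption, the strand that was actually aligned can be obtained from the database entry with a reverse complement, sketched below in Python with hypothetical names:

```python
# Translation table for DNA bases; N is kept as N, case is preserved.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return sequence.translate(_COMPLEMENT)[::-1]


example = reverse_complement("ATGCN")
```

Applying the function twice returns the original string, which is a quick way to confirm that no base was dropped.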



RESULT 1 
GBU34401 
LOCUS GBU34401 1699 bp DNA PLN 01-JAN-1996 
DEFINITION Gossypium barbadense FbLate-2 gene, complete cds. 
ACCESSION U34401 
VERSION U34401.1 GI:1143223 
KEYWORDS 
SOURCE sea-island cotton. 
ORGANISM Gossypium barbadense 
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 
euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core 
eudicots; Rosidae; eurosids II; Malvales; Malvaceae; Gossypium. 
REFERENCE 1 (bases 1 to 1699) 
AUTHORS Rinehart,J., Petersen,M. and John,M.E. 
TITLE Tissue-specific and Developmental Regulation of Cotton mRNA, 
FbLate-2: Promoter Studies in Transgenic Plants 
JOURNAL Unpublished 
REFERENCE 2 (bases 1 to 1699) 
AUTHORS John,M.E. 
TITLE Direct Submission 
JOURNAL Submitted (21-AUG-1995) Maliyakal E. John, Fiber Technology, 
Agracetus, 8520 University Green, Middleton, WI 53562, USA 
FEATURES Location/Qualifiers 
source 1..1699 
/organism="Gossypium barbadense" 
/strain="Sea Island" 
/db_xref="taxon:3634" 
/clone="FbL2-82A" 
369..1585 
/gene="FbLate-2" 
369..1585 
/gene="FbLate-2" 
CDS 379..1380 
/gene="FbLate-2" 
/codon_start=1 
/protein_id="AAA84881.1" 
/db_xref="GI:1143224" 
/translation="MIGSHTVSTAARRLFETQTTSSELPQLASKYERQEESEYERPEY 

EKHEVEY PE I PEYKEKQDEG KEHKHEECHKSHESKEHEEYEKEKPNFPKGE KPKEHEK 
polyA_signal 1448..1454 
/gene="FbLate-2" 
BASE COUNT 661 a 328 c 328 g 
ORIGIN 

Query Match 9.5%; Score 523.8; DB 7; Length 1699; 
Best Local Similarity 89.5%; Pred. No. 1.5e-50; 
Matches 599; Conservative 0; Mismatches 62; Indels 8; Gaps 

Qy 3850 tattactcgaactaaatgttgtcacaaattattatctaaataaagaa-aaacacttaat 3907 

linn ii i mi i inn mini in in n i mi 

Db 1 TATTACCTGAGCCAAATGCTCTCACAAACTATTATCCAAAAAAAAAATGHGAATATAAT 60 
Qy 3908 ttttataacattttttcatatatttgaaagattatattttgtatatttacgtaaaaatat 3967 

1 1 m 1 1 1 ! 1 1 m i m r 1 1 m 1 1 1 1 1 1 iimiimiiimiiiiiimiimiii 

Db 61 TTTTATAACATTTTTTCATATATTTGCAAGATTATATTTTGTATATTTACGTAAAAATAT 120 
Qy 3968 ttgacatagattgagcaccttcttaacataatcccaccataagtcaagtatgtagatgag 4027 

' imimiiim r i n r 1 1 m 1 1 1 1 1 1 n f 1 1 1 1 m 1 1 u i r 1 1 1 1 r n e 1 1 1 e i m 

Db 121 TTGACATAGATTGAACACCTTCTTAACATAATCCCACCATAAGTCAAGTATGTAGATGAG 180 

Qy 4028 aaattggtacaaacaacgtggggccaaatcccaccaaaccatctctcattctctcctata 4087 
I M I ! r I M ! 1 1 ! ! 1 1 M II 1 1 1 ! fl 1 1 1 1 M 1 1 1 1 1 M 1 f M 11 1 1 II 1111111111 

Db 181 AAATTGGTACAAACAACGTGGGGCCAAATCCCACCAAACCATCTCTCATCCTCTCCTATA 240 
Qy 4088 aaaggcttgctacacatagacaacaatccacacacaaatacac gttcttttcttt 4142 

mini i iimm nimiiimmiiiiiiiii mm mi 

Db 241 AAAGGCTAGTTACACATACACAACAATCCACACACAAATACACTCAAAATTCTTTGCTTT 300 

Qy 4143 ctattt-gattaaccatggctcatagcattcgtcaccctttcttccttttccaactttta 4201 

Db 301 GTATTTCGGTTAACCATGGCTCM 360 

Qy 4202 ctcataagtgtctcactagtgaccggtagccacactgtttcggcagcggctcgacgttta 4261 

ism 1 1 1 1 1 1 1 1 1 1 1 1 iii iiiinmm ii iii iiiiiimiiiiim 

Db 361 CTCATTAGTGTCTCACTAATGATCGGTAGCCACACCGTCTCGACAGCGGCTCGACGTTTA 420 
Qy 4262 ttcgagacacaagcaacctcatcagagctcccacaattggcttcaaaatacgaaaagcac 4321 

1 1 1 1 1 1 1 1 1 1 1 1 Minimi in i mini mimiiiimiinii 

Db 421 TTCGAGACACAAACAACCTCATCGGAGTTGCCACAATTAGCTTCAAAATACGAAAAGCAG 480 
Qy 4322 gaagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacgaagagtactca 4381 

minimum mini i iiinimiiiiimim minimi i.: 

Db 481 GAAGAGTCTGAATATGAAAAGCCGGAATACAAACAGCCAAAGTATGACGAAGAGTACCCA 540 
Qy 4382 aaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgcaaacagcatgaa 4441 

mi iiiiiii i ii 1 1 1 1 1 1 m 1 1 1 1 1 ! m 1 1 1 1 inn ii nun 

Db 541 AAACATGAGAAGCCTGAAATTCACAAGGAGGAAAAACAAAAACCGTGCAAGCAACATGAA 600 
Qy 4442 gagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaagaaaatctcgac 4501 

1 1 1 1 m i m m 1 1 1 1 1 1 1 mi iimm iiiimi miiimi i m 

Db 601 GAGTACCACGAGTCACACAAATCGAAGGAGCACGAAGAGTACCAGAAAGAAAAACCCGAG 660 

Qy 4502 gggcccgaa 4510 

III II 

Db 661 TTCCCCAAA 669 
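
The line printed between each Qy and Db pair marks identical positions, and in this scan those lines are badly damaged. Where both sequence lines are legible, the marker line can be regenerated instead of repaired by hand. The Python sketch below assumes the convention of a vertical bar under each identical base and a space elsewhere, with gap characters never counted as matches; the function name is hypothetical.

```python
def match_line(query, subject):
    """Build the marker line for two equal-length aligned rows.

    A '|' is emitted where the bases agree (case-insensitive) and neither
    position is a gap ('-'); a space is emitted everywhere else.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same number of columns")
    return "".join(
        "|" if q != "-" and q.upper() == s.upper() else " "
        for q, s in zip(query, subject)
    )


# Final row of the alignment directly above (Qy 4502-4510, Db 661-669).
markers = match_line("gggcccgaa", "TTCCCCAAA")
```

For this row the regenerated line carries five identity marks, in groups of three and two, matching what survives of the scanned marker line.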



RESULT 2 
I18362 
LOCUS I18362 1283 bp DNA 
DEFINITION Sequence 17 from patent US 5495070. 
ACCESSION I18362 
VERSION I18362.1 GI:1598717 
KEYWORDS 
SOURCE Unknown. 
ORGANISM Unknown. 
Unclassified. 
REFERENCE 1 (bases 1 to 1283) 
AUTHORS John,M. 
TITLE Genetically engineering cotton plants for altered fiber 
JOURNAL Patent: US 5495070-A 17 27-FEB-1996; 
FEATURES Location/Qualifiers 
source 1..1283 
/organism="unknown" 
BASE COUNT 509 a 233 c 251 g 290 t 
BASE COUNT 190 a 137 c 192 g 193 t 1 others 
ORIGIN 

Query Match 33.3%; Score 303.4; DB 44; Length 713; 
Best Local Similarity 71.1%; Pred. No. 2.6e-63; 
Matches 414; Conservative 0; Mismatches 167; Indels 1; Gaps 1; 



Qy 5 aacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaa 64 

II llllll II I II II II I II III UNI I II Mill II llll'll II 

Db 119 AAAGATGAGCGCTACTAGGTTCATCAAGTGCGTCACTGATGGGGATGGTGCTGTGGGCAA 178 

Qy 65 aacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagt 124 

III III MM Mill II llllll III! II I llllll II II II 

Db 179 AACCTGTTTGCTTATTTCCTACACCAGCAACACTTTTCCCAGCGATTATGTGCCGACTGT 238 

Qy 125 atttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatg 184 

nil ii ii ii ii mum ii iim mi inn n n i i 

Db 239 TTTTGACAATTTCAGCGCAAATGTGGTTGTCAATGGGAGCATTGTGAATCTGGGTTTGAG 298 
Qy 185 ggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagctga 244 

in inn ii inn miim n mi n mini i n n n 

Db 299 GGATACTGCTGGACAAGAGGATTATAACAGATTAAGACCTTTGAGTTACCGTGGTGCCGA 358 

Qy 245 tgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaa 304 

III II' I Mil II lllll linilllllllll llllll III I III 
Db 359 TGTTTTCATACTGGCTTTCTCTCTCATAAGCAAGGCCAGGTATGAAAATGTCTCTAGAAA 418 

Qy 305 gtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaa 364 

llllll II III I I lllllll I III I II I II lllll II II 
Db 419 GTGGATTCCGGAGTTGAAGCATTATGCTGCTGGTGTGCGCATTATTCTGGTTGGCACAAA 478 

Qy 365 actagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatc 424 

II II I II lllll lllllll llllll II Mil II Ml II I 

Db 479 GCTTGACCTTCGGGATGATAAGCAGTTCTGCATTGACCATNCTGGTGCCGTACCTATGAC 538 

Qy 425 aacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcag 484 

III Mil lllllll I! I IIM Mil II I II II llllllll 

Db 539 CACAGCTCATGGAGAAGAGCTTAGGAAGCTGATTAATGCGCCAGCATACATTGAATGCAG 598 

Qy 485 ctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgag 544 

II llll II MM lllllll! II II MM! II III III I I 

Db 599 ATCACAAACACAGGAGAACGTGAAGGCAGTCTTTGATGCAGCCATATGAGTTGTCCTTCA 658 

Qy 545 gccaccaaaaccaaagagaaagccttgcaaaaggagaacatg 586 

III I II II III I I I! llll I llllll 

Db 659 TCCA-CGTAAGCAGGAAGAGAACGAAGCTAAAGCACAACATG 699 



RESULT 15 
AW688369 
LOCUS AW688369 625 bp mRNA EST 17-APR-2000 
DEFINITION NF006F02ST1F1000 Developing stem Medicago truncatula cDNA clone 
NF006F02ST 5', mRNA sequence. 
ACCESSION AW688369 
VERSION AW688369.1 GI:7563105 
KEYWORDS EST. 
SOURCE barrel medic. 
ORGANISM Medicago truncatula 
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; 
Magnoliophyta; eudicotyledons; Rosidae; eurosids I; Fabales; 
Fabaceae; Papilionoideae; Medicago. 
REFERENCE 1 (bases 1 to 625) 
AUTHORS Torres-Jerez,I., Scott,A.D., Harris,A.R., Gonzales,R.A., Bell,C.J., 
Flores,H.R., Inman,J.T., Weller,J.W. and May,G.D. 
TITLE Expressed Sequence Tags from the Samuel Roberts Noble Foundation 
Center for Medicago Genomics Research 
JOURNAL Unpublished (2000) 
COMMENT On May 20, 1999 this sequence version replaced gi:4878271. 
Contact: Dixon RA 
Plant Biology Division 
The Samuel Roberts Noble Foundation 
2510 Sam Noble Parkway, Ardmore, OK 73402, USA 
Tel: 580 221 7302 
Fax: 580 221 7380 
Email: radixon@noble.org 
Insert Length: 625 Std Error: 0.00 
Plate: 006 row: F column: 02 
Seq primer: TCACACAGGAAACAGCTATGAC. 
FEATURES Location/Qualifiers 
source 1..625 
/organism="Medicago truncatula" 
/db_xref="taxon:3880" 
/clone="NF006F02ST" 
/clone_lib="Developing stem" 
/tissue_type="stem" 
/dev_stage="Pooled developmental 
/note="Vector: Lambda Zap; Contains a mixture of 
internodal stem segments" 
BASE COUNT a 103 c 139 g 191 t 2 others 
ORIGIN 

Query Match 32.9%; Score 299; DB 79; Length 625; 
Best Local Similarity 78.9%; Pred. No. 3e-62; 
Matches 367; Conservative 0; Mismatches 97; Indels 1; Gaps 1; 



Qy 1 aaaaaacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggag-ctgtg 59 

I! I I llllll M II lllll lllll lllll II II II lllll I Mil 
Db 161 AACAGAAAATGAGTACAGCTAGATTCATCAAATGTGTTACTGTTGGAGATGGTGCCTGTT 220 

Qy 60 gggaaaacttgtatgctcatttcatataccagcaatactttcccaacggattatgttcca 119 

ii ii minimi ii ii ii ii mum inn imiiimi n 

Db 221 GGAAAGACTTGTATGCTTATCTCTTACACAAGCAATACATTCCCTACGGATTATGTGCCT 280 
Qy 120 acagtatttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggc 179 

ii ii miim i! iiiii nm inn 1 1 m m 1 1 1 1 1 1 1 1 ii inn 

Db 281 ACTGTTTTTGATAATTTCAGTGCAAATGTTGTGGTTGATGGCAGCACAGTTAATCTTGGA 340 

Qy 180 ctatgggacactgccgggcaagaagattataataggctaaggccactgagttatagagga 239 

limilllllll II lllll llllllll Mil! llllll llll MMIIIM 
Db 341 TTATGGGACACTGCTGGACAAGAGGATTATAACAGGCTTAGGCCATTGAGCTATAGAGGA 400 

Qy 240 gctgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctac 299 

Db 401 GCAGATGTGTTTTTGTTGGCCm 460 

Qy 300 aaaaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttgga 359 

iiinmiii ii ii ii imiimiiii i Mimi Milium in 

Db 461 AAAAAGTGGATTCCTGAACTCAGACATTATGCTCCAACTGTACCAATTGTGCTTGTGGGA 520 
Qy 360 accaaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacacca 419 

iiiiMii mm i ii i! i iiii! i innm n mm in i 

Db 521 ACCAAACTTGATTTGAGGGAAGATAGGCAGTATTTGATTGATCATCCAGGAGCTACAGCT 580 

Qy 420 atatcaacatctcagggagaagaactaaagaagatgataggagca 464 

II I II I I! II II! I II llll I II II II! 
Db 581 ATTACTACTGCCCANGGTGAANAGCTGAAGAGGGCAATTGGTGCA 625 
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minium! iiiiiiiiiinii iimiii inn i n iiuiiiin 

Db 299 TATGGGACACTGCAGGGCAAGAAGATTACAATAGGCTGAGGCCTTTAAGCTATAGAGGAG 358 
Qy 241 ctgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctaca 300 

iiiiiiiiiiiiiiin ii nun ii urn minimi mini n 

Db 359 CTGATGTGTTTTTGTTGTGCTATTCTCTCATCAGCAAAGCCAGTTATGAGAACATCTCCA 418 
Qy 301 aaaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaa 360 

minium ii inn i ii ii 1 1 1 1 1 1 1 1 mi n i mn n mi 

Db 419 AAAAGTGGATACCTGAGCTGAGACATTATGCTCCANATGTGCCTATAGTGCTGGTGGGAA 478 
Qy 361 ccaaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaa 420 

i iimiii! Milium mi i n mini n m i m n 

Db 479 CAAAACTAGATNTGCGAGATGANCAGCAATNTCTGATTGATCATCCGGGATCCGCACGAA 538 
Qy 421 tatcaacatctcaggga 437 

ii mi iiim i 

Db 539 TAACAACTGCTCAGGCA 555 



RESULT 12 
AI900170 

LOCUS AI900170 658 bp mRNA EST 06-DEC-1999 

DEFINITION sc01g12.y1 Gm-c1012 Glycine max cDNA clone GENOME SYSTEMS CLONE ID: 
Gm-c1012-959 5' similar to SW:RAC1_PEA Q35638 RAC-LIKE GTP BINDING 
PROTEIN RHO1. ;, mRNA sequence. 
ACCESSION AI900170 
VERSION AI900170.1 GI:5606072 
KEYWORDS EST. 
SOURCE soybean. 
ORGANISM Glycine max 
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; 
Magnoliophyta; eudicotyledons; Rosidae; eurosids I; Fabales; 
Fabaceae; Papilionoideae; Glycine. 
REFERENCE 1 (bases 1 to 658) 
AUTHORS Shoemaker,R., Keim,P., Vodkin,L., Erpelding,J., Coryell,V., 
Khanna,A., Bolla,B., Marra,M., Hillier,L., Kucaba,T., Martin,J., 
Beck,C., Wylie,T., Underwood,K., Steptoe,M., Theising,B., Allen,M., 
Bowers,Y., Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., 
Schurk,R., Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., 
McCann,R., Waterston,R. and Wilson,R. 
TITLE Public Soybean EST Project 
JOURNAL Unpublished (1999) 
COMMENT On Oct 30, 1998 this sequence version replaced gi:3812130. 
Contact: Shoemaker R/Public Soybean EST Project 
Public Soybean EST Project 
Washington University School of Medicine 
4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA 
Tel: 314 286 1800 
Fax: 314 286 1810 
Email: est@watson.wustl.edu 
This clone is available through: Genome Systems, Inc. 4633 World 
Parkway Circle St. Louis, Missouri 63134 For further information 
call: (800) 430-0030 or (314) 427-3222 FAX: (888) 919-3324 or (314) 
427-3324 or contact: clones@genomesystems.com or 
info@genomesystems.com web site: www.genomesystems.com 
Possible reversed clone: similarity on wrong strand 
Seq primer: -40RP from Gibco 
High quality sequence stop: 401. 
FEATURES Location/Qualifiers 
source 1..658 
/organism="Glycine max" 
/db_xref="taxon:3847" 
/clone="GENOME SYSTEMS CLONE ID: Gm-c1012-959" 
/clone_lib="Gm-c1012" 
/tissue_type="Apical shoot tips, 9-10 day old etiolated 
seedlings" 
/lab_host="XL10-Gold" 
/note="Vector: pBluescript II XR; Site_1: EcoRI; Site_2: 
XhoI; This cDNA library was constructed from mRNA isolated 
from the apical shoots of 9 to 10 day old etiolated 
seedlings. The shoot tips including any emerged leaves 
were harvested for mRNA isolation. The cDNA library was 
prepared using the Stratagene pBluescript II XR cDNA 
library construction kit. Complementary DNA was 
synthesized from mRNA using a primer consisting of a poly 
(dT) sequence with a XhoI restriction site. EcoRI adapters 
were ligated to the blunt-ended cDNA fragments followed by 
XhoI digestion. The cDNA fragments were directionally 
cloned into the EcoRI-XhoI restriction site of the 
pBluescript vector. The ligated cDNA fragments were 
transformed into XL10-Gold host cells. This library was 
constructed by Dr. Randy Shoemaker and Dr. John 
Erpelding." 
BASE COUNT 176 a 125 c 171 g 184 t 2 others 
ORIGIN 

Query Match 33.6%; Score 305.6; DB 45; Length 658; 
Best Local Similarity 73.9%; Pred. No. 7.7e-64; 
Matches 386; Conservative 0; Mismatches 136; Indels 0; Gaps 0; 

Qy 5 aacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaa 64 

ii mm ii i ii ii nmiii mini n in iimiii ii 

Db 136 AAAGATGAGCGCTTCTAGGTTCATCAAGTGCGTCACTGTTGGGGATGGTGCTGTGGGCAA 195 

Qy 65 aacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagt 124 

III II llll lllll II llllllll llllllll II MUM II II II 
Db 196 AACCTGCTTGCTTATTTCCTACACCAGCAACACTTTCCCCACCGATTATGTGCCGACTGT 255 

Qy 125 atttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatg 184 

lllll II II lllll llllllll II llll llll lllll II II I II 
Db 256 TTTTGACAATTTCAGTGCAAATGTGGTTGTCAATGGGAGCATTGTGAATCTGGGTTTGTG 315 

Qy 185 ggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagctga 244 

III lllll II lllll lllll II II llll II lllllll I II II II 
Db 316 GGATACTGCTGGACAAGAGGATTACAACAGATTAAGACCTTTGAGTTACCGTGGTGCCGA 375 

Qy 245 tgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaa 304 

III II I lllll II lllll lllllllllllllllllllllll III lllll " 
Db 376 TGTTTTCATATTGGCTTTCTCTCTCATAAGCAAGGCCAGTTATGAAAATGTCTCTAAAAA 435 

Qy 305 gtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaa 364 

iiiiii mm 1 1 iimm i i in n n i n mn n i ■« 

Db 436 GTGGATTCCAGAGTTGAAGCATTATGCACCTGGTGTCCCCATTATTCTGGTTGGCACANA 495 

Qy 365 actagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatc 424 

II II I II lllll llllllll III II II llll II III II I 

Db 496 GCTTGACCTTCGGGATGATTAGCAGTTCTGCATCGACCATTCTGGTGCCGTACCTATTAC 555 

Qy 425 aacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcag 484 

III llll llllllll II I llll llll III llll II llllllll 

Db 556 CACAGCTCANGGAGAAGAGCTTAGGAAGCTGATTTATGCACCAGCTTACATTGAATGCAG 615 

Qy 485 ctccaaaacccaacagaatgtgaaggctgttttcgatgctgc 526 

lllllll II III! Ill I II II II lllll II 
Db 616 TTCAAAAACACAGGAGAACGTGGATGCAGTCTTTGATGCAGC 657 



RESULT 13 
AI901141 

LOCUS AI901141 549 bp mRNA EST 06-DEC-1999 

DEFINITION sc21b12.y1 Gm-c1013 Glycine max cDNA clone GENOME SYSTEMS CLONE ID: 
Gm-c1013-1272 5' similar to SW:RACD_GOSHI Q41253 RAC-LIKE GTP 
BINDING PROTEIN RAC13. ;, mRNA sequence. 
ACCESSION AI901141 
VERSION AI901141.1 GI:5607043 
KEYWORDS EST. 
ORGANISM Glycine max 
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; 
Magnoliophyta; eudicotyledons; Rosidae; eurosids I; Fabales; 
Fabaceae; Papilionoideae; Glycine. 
REFERENCE 1 (bases 1 to 549) 
AUTHORS Shoemaker,R., Keim,P., Vodkin,L., Erpelding,J., Coryell,V., 
Khanna,A., Bolla,B., Marra,M., Hillier,L., Kucaba,T., Martin,J., 
ACCESSION AI900160 
VERSION AI900160.1 GI:5606062 
SOURCE 
ORGANISM Glycine max 
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; 
Magnoliophyta; eudicotyledons; Rosidae; eurosids I; Fabales; 
Fabaceae; Papilionoideae; Glycine. 
REFERENCE 1 (bases 1 to 688) 
AUTHORS Shoemaker,R., Keim,P., Vodkin,L., Erpelding,J., Coryell,V., 
Khanna,A., Bolla,B., Marra,M., Hillier,L., Kucaba,T., Martin,J., 
Beck,C., Wylie,T., Underwood,K., Steptoe,M., Theising,B., Allen,M., 
Bowers,Y., Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., 
Schurk,R., Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., 
McCann,R., Waterston,R. and Wilson,R. 
TITLE Public Soybean EST Project 
JOURNAL Unpublished (1999) 
COMMENT On Jul 30, 1997 this sequence version replaced gi:2286374. 
Contact: Shoemaker R/Public Soybean EST Project 
Public Soybean EST Project 
Washington University School of Medicine 
4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA 
Tel: 314 286 1800 
Fax: 314 286 1810 
Email: est@watson.wustl.edu 
This clone is available through: Genome Systems, Inc. 4633 World 
Parkway Circle St. Louis, Missouri 63134 For further information 
call: (800) 430-0030 or (314) 427-3222 FAX: (888) 919-3324 or (314) 
427-3324 or contact: clones@genomesystems.com or 
info@genomesystems.com web site: www.genomesystems.com 
Possible reversed clone: similarity on wrong strand 
Seq primer: -40RP from Gibco 
High quality sequence stop: 431. 
FEATURES Location/Qualifiers 
source 
/organism="Glycine max" 
/db_xref="taxon:3847" 
/clone="GENOME SYSTEMS CLONE ID: Gm-c1012-936" 
/clone_lib="Gm-c1012" 
/tissue_type="Apical shoot tips, 9-10 day old etiolated 
seedlings" 
/lab_host="XL10-Gold" 
/note="Vector: pBluescript II XR; Site_1: EcoRI; Site_2: 
XhoI; This cDNA library was constructed from mRNA isolated 
from the apical shoots of 9 to 10 day old etiolated 
seedlings. The shoot tips including any emerged leaves 
were harvested for mRNA isolation. The cDNA library was 
prepared using the Stratagene pBluescript II XR cDNA 
library construction kit. Complementary DNA was 
synthesized from mRNA using a primer consisting of a poly 
(dT) sequence with a XhoI restriction site. EcoRI adapters 
were ligated to the blunt-ended cDNA fragments followed by 
XhoI digestion. The cDNA fragments were directionally 
cloned into the EcoRI-XhoI restriction site of the 
pBluescript vector. The ligated cDNA fragments were 
transformed into XL10-Gold host cells. This library was 
constructed by Dr. Randy Shoemaker and Dr. John 
Erpelding." 
BASE COUNT 191 a 132 c 179 g 184 t 2 others 

Query Match 33.8%; Score 307.2; DB 45; Length 688; 
Best Local Similarity 74.1%; Pred. No. 3.2e-64; 
Matches 387; Conservative 0; Mismatches 135; Indels 0; Gaps 0; 

Qy 5 aacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaa 64 

ii linn ii i ii ii ilium inn ii ii urn iiimii i 

Db 166 AAAGATGAGCGCTTCTAGGTTCATCAAGTGCGTCACTGTTGGGGATGGTGCTGTGGGCAN 225 
Qy 65 aacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagt 124 

ill ii mi inn n ilium iiimii n nmiii n n n 

Db 226 AACCTGCTTGCTTATTTCCTACACCAGCAACACTTTCCCCACCGATTATGTGCCGACTGT 285 



Qy 125 atttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatg 184 

mil ii ii mil iiniin ii nn nn inn n n i n 

Db 286 TTTTGACAATTTCAGTGCAAATGTGGTTGTCAATGGGAGCATTGTGAATCTGGGTTTGTG 345 
Qy 185 ggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagctga 244 

iii inn ii inn inn n n nn n imin i n n n 

Db 346 GGATACTGCTGGACAAGAGGATTACAACAGATTAAGACCTTTGAGTTACCGTGGTGCCGA 405 

Qy 245 tgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaa 304 

Db 406 TGTTTTCATATTGGCTTTCTCTCTCA^ 465 

Qy 305 gtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaa 364 

1 1 1 1 ii min 1 1 ilium i i m n ii i ii mil n n 

Db 466 GTGGATTCCAGAGTTGAAGCATTATGCACCTGGTGTCCCCATTATTCTGGTTGGCACAAA 525 

Qy 365 actagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatc 424 

II II I II lllll E 1 1 1 1 1 1 1 1 III II II lllll II III II I 
Db 526 GCTTGACCTTCGGGATGATAAGCAGTTCTGCATCGACCATCCTGGTGCCGTACCTATTAC 585 

Qy 425 aacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcag 484 

ill mi minim ii i mi nn in nn n muni 

Db 586 CACAGCTCANGGAGAAGAGCTTAGGAAGCTGATTAATGCACCAGCTTACATTGAATGCAG 645 

Qy 485 ctccaaaacccaacagaatgtgaaggctgttttcgatgctgc 526 

II lllll I I I 1 1 1 1 ! 1 1 1 II II lllll II 
Db 646 TTCAAAAACACCAGGAGACGTGAAGGCAGTCTTTGATGCAGC 687 



RESULT 10 
AW705028 

LOCUS AW705028 592 bp mRNA EST 18-APR-2000 

DEFINITION sk41f03.y1 Gm-c1019 Glycine max cDNA clone GENOME SYSTEMS CLONE ID: 
Gm-c1019-5142 5' similar to SW:RAC5_ARATH Q38937 RAC-LIKE GTP 
BINDING PROTEIN ARAC5. [1] ;, mRNA sequence. 
ACCESSION AW705028 
VERSION AW705028.1 GI:7589250 
KEYWORDS EST. 
ORGANISM Glycine max 
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; 
Magnoliophyta; eudicotyledons; Rosidae; eurosids I; Fabales; 
Fabaceae; Papilionoideae; Glycine. 
REFERENCE 1 (bases 1 to 592) 
AUTHORS Shoemaker,R., Keim,P., Vodkin,L., Erpelding,J., Coryell,V., 
Khanna,A., Bolla,B., Marra,M., Hillier,L., Kucaba,T., Martin,J., 
Beck,C., Wylie,T., Underwood,K., Steptoe,M., Theising,B., Allen,M., 
Bowers,Y., Person,B., Swaller,T., Gibbons,M., Pape,D., Harvey,N., 
Schurk,R., Ritter,E., Kohn,S., Shin,T., Jackson,Y., Cardenas,M., 
McCann,R., Waterston,R. and Wilson,R. 
TITLE Public Soybean EST Project 
JOURNAL Unpublished (1999) 
COMMENT On Jun 22, 1998 this sequence version replaced gi:3246649. 
Contact: Shoemaker R/Public Soybean EST Project 
Public Soybean EST Project 
Washington University School of Medicine 
4444 Forest Park Parkway, Box 8501, St. Louis, MO 63108, USA 
Tel: 314 286 1800 
Fax: 314 286 1810 
Email: est@watson.wustl.edu 
This clone is available through: Genome Systems, Inc. 4633 World 
Parkway Circle St. Louis, Missouri 63134 For further information 
call: (800) 430-0030 or (314) 427-3222 FAX: (888) 919-3324 or (314) 
427-3324 or contact: clones@genomesystems.com or 
info@genomesystems.com web site: www.genomesystems.com 
Seq primer: -40RP from Gibco 
High quality sequence stop: 430. 
FEATURES Location/Qualifiers 
source 1..592 
/organism="Glycine max" 
/db_xref="taxon:3847" 
/clone="GENOME SYSTEMS CLONE ID: Gm-c1019-5142" 
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University of Minnesota 

411 Borlaug Hall, 1991 Upper Buford Circle, St. Paul, MN 55108 USA 
Tel: 612-625-7219 
Fax: 651-649-5058 
Email: vance004@maroon.tc.umn.edu 
Minnesota EST name: M250659e; TIGR sequence name: MTBAD48TK; More 
information, including clone ordering, is available at 
'http://chrysie.tamu.edu/medicago' 
Seq primer: SKmod (CTA gAA CTA gtg gAT CC). 
FEATURES Location/Qualifiers 
source 1..669 
/organism="Medicago truncatula" 
/cultivar="genotype A17" 
/db_xref="taxon:3880" 
/clone="pDSIR-7H24" 
/clone_lib="DSIR" 
/tissue_type="infected root" 
/note="vector: pBluescript SK +/-; Site_1: EcoRI; Site_2: 
XhoI; roots infected with Phytophtora medicaginis" 
BASE COUNT 175 a 125 c 162 g 205 t 2 others 
ORIGIN 

Query Match 35.6%; Score 324.4; DB 74; Length 669; 
Best Local Similarity 75.2%; Pred. No. 2.2e-68; 
Matches 403; Conservative 0; Mismatches 133; Indels 0; Gaps 0; 

Qy 2 aaaaacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggg 61 

iii i nun ii i ii ii iiiiiiiiiii ii ii ii milium n 

Db 133 AAAGAAGATGAGCGCTTCTAGGTTCATCAAGTGTGTTACTGTTGGGGATGGAGCTGTTGG 192 
Qy 62 gaaaacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaac 121 

iiimm ii i ilium iimimii inn ii ii inn mil 

Db 193 TAAAACTTGTTTGTTAATTTCATACACCAGCAATACCTTCCCCACTGACTATGTGCCAAC 252 
Qy 122 agtatttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcct 181 

ii ii ii ii ii inn mum m mi nm inn n n i 

Db 253 TGTCTTCGACAATTTCAGTGCAAATGTGGTTGTGAATGGAAGCACTGTGAATCTGGGTTT 312 
Qy 182 atgggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagc 241 

iiiiiiiiiii ii nm mum ii mi ii miini i n n 

Db 313 GTGGGACACTGCAGGACAAGAGGATTATAACAGATTAAGACCTTTGAGTTATCGTGGTGC 372 

Qy 242 tgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaa 301 

Mill II I I II II II II II 1 1 1 1 11 M 1 1 1 1 1 1 M 1 1 1 1 1 I I III 
Db 373 CGATGTTTTCATTCTCGCTTTCTCCCTCATAAGCAAGGCCAGTTATGAAAATGTTTCCAA 432 

Qy 302 aaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaac 361 

miiiiii mm i i minim i i m ii m i ii nm n 

Db 433 AAAGTGGATTCCAGAGTTGAAGCATTATGCACCTGGTGTTCCCATTATTCTGGTTGGCAC 492 

Qy 362 caaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaat 421 

II II II I II 1 1 1 1 1 1 1 1 1 1 ' 1 1 II I II II II I II II II 
Db 493 AAAGCTTGACCTTCGGGATGACAAGCAGTTCTTCGTCGACCATCCAAGTGCTGTTCCTAT 552 

Qy 422 atcaacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatg 481 

i ii mi miiiiii ii i mi mi iii i mini nm 

Db 553 TACCACTGCTCANGGAGAAGAGCTTANGAAGCTGATCAATGCACCTGCTTATATCGAATG 612 
Qy 482 cagctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtag 537 

iii ii iii i ii minimi n n n nm n mi mi i 

Db 613 CAGTTCGAAATCACAGCAGAATGTGAAAGCAGTCTTTGATGCAGCCATAAGAGTTG 668 
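The marker row between each Qy and Db line flags identical bases, and the "Best Local Similarity" figures printed in this report are consistent with matches divided by all aligned columns (matches + mismatches + indels). The following is a minimal sketch of both calculations, not the search program's own code; the function names `match_line` and `local_similarity` are hypothetical, and treating an ambiguous base such as N as a mismatch is an assumption.

```python
def match_line(query: str, subject: str) -> str:
    """Return '|' under columns whose bases agree (ignoring case), ' ' elsewhere."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    return "".join(
        # A gap character never counts as a match; N vs. a base is a mismatch here.
        "|" if q != "-" and q.upper() == s.upper() else " "
        for q, s in zip(query, subject)
    )


def local_similarity(matches: int, mismatches: int, indels: int = 0) -> float:
    """Percentage of aligned columns that are matches, to one decimal place."""
    return round(100.0 * matches / (matches + mismatches + indels), 1)


# Last Qy/Db pair of the alignment above (Qy 482-537, Db 613-668).
qy = "cagctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtag"
db = "CAGTTCGAAATCACAGCAGAATGTGAAAGCAGTCTTTGATGCAGCCATAAGAGTTG"
print(match_line(qy, db))

# Matches 403, Mismatches 133, Indels 0 reproduces the reported 75.2%.
print(local_similarity(403, 133, 0))
```

Regenerating the marker row this way is one option for checking rows whose '|' characters did not survive scanning.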



RESULT 7 
AW349629/C 

LOCUS AW349629 796 bp mRNA EST 01-FEB-2000 

DEFINITION GM210005B21A12R Gm-r1021 Glycine max cDNA clone Gm-r1021-1584 3',

mRNA sequence. 

ACCESSION AW349629 

VERSION AW349629.1 GI: 6847339 

KEYWORDS EST.

SOURCE soybean.



ORGANISM Glycine max 

Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;

Magnoliophyta; eudicotyledons; Rosidae; eurosids I; Fabales; 

Fabaceae; Papilionoideae; Glycine. 
REFERENCE 1 (bases 1 to 796) 
AUTHORS Vodkin,L., Keim,P., Shoemaker,R., Retzel,E., Khanna,A., Coryell,V.,
Erpelding,J., Raph,C., Shoop,E., Pardinas,J., Liu,L. and Lewin,H.
TITLE A Functional Genomics Program for Soybean (NSF 9872565) 
JOURNAL Unpublished (1999) 
COMMENT On Oct 8, 1998 this sequence version replaced gi: 3727950. 

Other_ESTs: AI440994

Contact: Vodkin, L.O., PI, A Functional Genomics Program for 
Soybean (NSF 9872565) 

Lewin, H. A., Director, Keck Center for Comparative and Functional
Genomics
University of Illinois 

Edwin R. Madigan Building, 1201 W. Gregory, Urbana, IL 61801, USA 
Tel: (217) 244-6147 
Fax: (217) 333-4582 
Email: l-vodkin@uiuc.edu

This clone is available through: Genome Systems, Inc. 4633 World 
Parkway Circle St. Louis, Missouri 63134. For further information 
call: (800) 430-0030 or (314) 427-3222 FAX: (888) 919-3324 or (314)
427-3324 or contact: clones@genomesystems.com or info@genomesystems.com
web site: www.genomesystems.com
Seq primer: 5'-TTTTTTTTTTTTTTTTTTTT(A/C/G)-3'.
FEATURES Location/Qualifiers 
source 1..796
/organism="Glycine max"
/cultivar="Williams"
/db_xref="taxon:3847"
/clone="Gm-r1021-1584"
/clone_lib="Gm-r1021"
/tissue_type="root"
/lab_host="XL10-Gold"
/note="Vector: pBluescript II XR; Site_1: EcoRI; Site_2:
XhoI; Library Gm-r1021 is a sequence-driven, reracked set
of the original library Gm-c1004 which was prepared from
root cDNA. The mRNA was isolated from entire roots of 8
day old 'Williams' seedlings which were propagated on
paper towels with distilled water. Stratagene's cDNA
Synthesis Kit (catalog #200401) was used to synthesize
the cDNA. The Gm-c1004 library was constructed by Dr.
Paul Keim & Virginia H. Coryell, Department of Biology,
Box 5640, Northern Arizona University, Flagstaff, AZ
86011, email: paul.keim@nau.edu, virginia.coryell@nau.edu.
The contig analysis to select unique genes was performed
by the laboratory of Ernest Retzel, Computational Biology
Centers, University of Minnesota,
http://www.cbc.umn.edu/ResearchProjects/Soybean/index.html.
Reracking was performed by Genome Systems, St. Louis,
http://www.genomesystems.com, and sequencing by the Keck
Center for Comparative and Functional Genomics,
University of Illinois,
http://www.life.uiuc.edu/biotech/keck.html."

BASE COUNT 222 a 168 c 150 g 225 t 31 others 

ORIGIN 



Query Match 35.0%; Score 318.6; DB 71; Length 796; 

Best Local Similarity 69.0%; Pred. No. 5.7e-67;

Matches 411; Conservative 0; Mismatches 185; Indels 0; Gaps 

Qy 6 acaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaaa 65 

1 Mini 1 ii 11 nm 111 11 11 11 nm m 

Db 794 AAAATGAGTGNNNNGANGTTCATTAAGTGCGTCNNNNTCNNCGACGGTGCTGTCNNNAAA 735 
Qy 66 acttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagta 125 

1 1 1 mi 1 11111111 nm 11 nm mn n i i 

Db 734 NNNNGCNNGNTGATTTNNNACACCAGCAACACTTTTCCCACGGACTATGTGCCCANNNTT 675 
Qy 126 tttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatgg 185 

. mn 11 11 inn iiiiiiiiiiiiiiiii mm mi mi iii 
Email: dfrisch@CLEMSON.EDU
5 prime sequence. 

Location/Qualifiers
1..732
/organism="Lycopersicon esculentum"
/cultivar="Rio Grande PtoR"
/db_xref="taxon:4081"
/clone="cLET19J17"
/clone_lib="tomato mixed elicitor, BTI"
/tissue_type="leaf"
/dev_stage="4-6 week old plants"
/lab_host="XL1-Blue MRF'"
/note="Vector: pBlueScript SK(-); Site_1: EcoRI; Site_2:
XhoI; cLET - Inoculated with a variety of disease response
elicitors. Plants exposed to 2,6 dichloroisonicotinic
acid, BTH, jasmonic acid, ethylene, fenthion, EIX,
okadaic acid, or systemin prior to tissue harvest. EcoRI
site was destroyed during cloning."
213 a 119 c 194 g 206 t



Query Match 37.1%; Score 337.6; DB 63; Length 732;
Best Local Similarity 74.1%; Pred. No. 1.5e-71;
Matches 427; Conservative 0; Mismatches 149; Indels 0; Gaps



Qy 5 aacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaa 64

ii nun n i ii inn iiiiiin ii ii ii iniiiiiniiii ii

Db 113 AAGAATGAGTGCTTCTAGGTTTATAAAGTGTGTTACCGTGGGCGATGGAGCTGTGGGTAA 172

Qy 65 aacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagt 124

Hum I iiiiiin minimi inn ii ii mum mil n

Db 173 AACTTGTCTTCTCATTTCGTATACCAGCAACACTTTTCCCACTGATTATGTCCCAACTGT 232

Qy 125 atttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatg 184

nun ii iiiiiin linn ii inn inn n n n n i n

Db 233 ATTTGACAATTTTAGTGCAAATGTGGTTGTCGATGGGAGCACTGTTAATCTGGGGCTCTG 292

Qy 185 ggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagctga 244

iii inn ii ii ii mil urn mi ii mi iii i ii inn

Db 293 GGATACTGCAGGTCAGGAGGATTACAATAGATTAAGACCTTTGAGCTATCGTGGGGCTGA 352

Qy 245 tgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaa 304

m iii i mi i u 1 1 1 1 1 ii iiiiiiii n iii iii mm

Db 353 TGTATTTATACTGGCATTTTCTCTCATTAGCAAGGCGAGCTATGAAAATGTCTCCAAAAA 412

Qy 305 gtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaa 364

llllll II II I II IIIIIIIIII I II III II I IIIIIIIIII II
Db 413 GTGGATTCCTGAATTGAGGCATTATGCTCCTGGAGTTCCAATTATTCTTGTTGGAACAAA 472

Qy 365 actagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatc 424

llllll I lllll II lllll III I I II II II II II III I
Db 473 GCTAGATCTCCGAGAGGATAAGCAATTCTTTGTGGACCATCCAGGTGCTGTTCCACTTAG 532

Qy 425 aacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcag 484

ii mini ii ii ii i iii iii ii n i i mi n mil n

Db 533 CACTGCTCAGGGTGAGGAGCTGAGAAAGTCGATTGGTGCTGCTGCTTACATTGAATGTAG 592

Qy 485 ctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgag 544

I Mill IIIIIIII I 1 1 1 1 1 1 H I M lllll II II INN I
Db 593 TGCAAAAACTCAACAGAACATTAAGGCTGTTTTTGATGCGGCCATTAAGGTGGTCCTACA 652

Qy 545 gccaccaaaaccaaagagaaagccttgcaaaaggag 580

lllll II I Mill III llllll
Db 653 ACCACCCAAGCAAAAGAAGAAGAAGAGGAGAAAGGG 688



RESULT 4 
AW040005 

LOCUS AW040005 732 bp mRNA EST 18-OCT-1999

DEFINITION EST282496 tomato mixed elicitor, BTI Lycopersicon esculentum cDNA

clone cLET19J17, mRNA sequence.
ACCESSION AW040005 



VERSION AW040005.1 GI:5898759
KEYWORDS EST.
SOURCE tomato.
ORGANISM Lycopersicon esculentum
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; Asteridae; euasterids I; Solanales;
Solanaceae; Solanum; Lycopersicon.
REFERENCE 1 (bases 1 to 732)
AUTHORS D'Ascenzo,M., He,X., Lyman,J., Holt,I.E., Liang,F., Upton,J.,
Ronning,C.M., Craven,M.B., Fujii,C.Y., Bowman,C.L., Nierman,W.,
Fraser,C.M., Venter,J.C., Martin,G.B., Tanksley,S.D. and
Giovannoni,J.
TITLE Generation of ESTs from tomato leaf tissue
JOURNAL Unpublished (1999)
COMMENT On Jun 5, 1998 this sequence version replaced gi: 3188234.
Contact: David Frisch
Clemson University Genomics Institute
Clemson University
100 Jordan Hall, Clemson, SC 29634, USA
Tel: 864 656 4366
Fax: 864 656 4293
Email: dfrisch@CLEMSON.EDU
3 prime sequence.
FEATURES Location/Qualifiers
source 1..732
/organism="Lycopersicon esculentum"
/cultivar="Rio Grande PtoR"
/db_xref="taxon:4081"
/clone="cLET19J17"
/clone_lib="tomato mixed elicitor, BTI"
/tissue_type="leaf"
/dev_stage="4-6 week old plants"
/lab_host="XL1-Blue MRF'"
/note="Vector: pBlueScript SK(-); Site_1: EcoRI; Site_2:
XhoI; cLET - Inoculated with a variety of disease response
elicitors. Plants exposed to 2,6 dichloroisonicotinic
acid, BTH, jasmonic acid, ethylene, fenthion, EIX,
okadaic acid, or systemin prior to tissue harvest. EcoRI
site was destroyed during cloning."
BASE COUNT 213 a 119 c 194 g 206 t
ORIGIN



Query Match 37.1%; Score 337.6; DB 63; Length 732;
Best Local Similarity 74.1%; Pred. No. 1.5e-71;
Matches 427; Conservative 0; Mismatches 149; Indels 0;



Qy 5 aacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaa 64 

II llllll II I II lllll IIIIIIII II II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 

Db 113 AAGAATGAGTGCTTCTAGGTTTATAAAGTGTGTTACCGTGGGCGATGGAGCTGTGGGTAA 172 

Qy 65 aacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagt 124 

mini i mum iiiiiiiMii nm n n iiiiiiii iiim ii 

Db 173 AACTTGTCTTCTCATTTCGTATACCAGCAACACTTTTCCCACTGATTATGTCCCAACTGT 232 

Qy 125 atttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatg 184 

llllll II IIIIIIII IIIIIIM II lllll lllll II II II II II II 
Db 233 ATTTGACAATTTTAGTGCAAATGTGGTTGTCGATGGGAGCACTGTTAATCTGGGGCTCTG 292 

Qy 185 ggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagctga 244 

III IIIM II II II IIIM Mill MM II IMI III I II lllll 

Db 293 GGATACTGCAGGTCAGGAGGATTACAATAGATTAAGACCTTTGAGCTATCGTGGGGCTGA 352 

Qy 245 tgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaa 304 

iii iii I nn iimiM m iiiiiiii ii iiiiiiii iii mm 

Db 353 TGTATTTATACTGGCATTTTCTCTCATTAGCAAGGCGAGCTATGAAAATGTCTCCAAAAA 412 

Qy 305 gtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaa 364 

llllll II II I II IIIIIIIIII I II III II I MM! II 

Db 413 GTGGATTCCTGAATTGAGGCATTATGCTCCTGGAGTTCCAATTATTCTTGTTGGAACAAA 472 

Qy 365 actagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatc 424 

llllll I lllll II lllll III I I II IMI II II III I 
gb_gssl3
gb_gssl4
gb_gssl5: 
gb_gssl6 
gb_gssl7 
gb_gssl8 
gb_gssl9 
enugssl3: 



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
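As a rough illustration of the definition above, the sketch below counts how many chance results would be expected at or above a given score, using an empirical background sample. This is a toy estimate under stated assumptions, not the search program's actual statistic: the function name `expected_chance_hits`, the idea of a separate list of background scores, and the simple tail-fraction estimate are all hypothetical. The very small values printed in this report (for example 2.2e-68) would require a fitted score distribution rather than direct counting.

```python
def expected_chance_hits(background_scores, threshold, database_size):
    """Expected number of unrelated entries scoring >= threshold.

    background_scores: scores observed for sequences assumed unrelated to the query
    threshold:         score of the result being reported
    database_size:     number of entries searched
    """
    if not background_scores:
        raise ValueError("need at least one background score")
    # Fraction of the background sample that reaches the threshold.
    tail = sum(1 for s in background_scores if s >= threshold)
    return database_size * tail / len(background_scores)


# Two of four background scores reach 30, so half of 1000 entries are expected to.
print(expected_chance_hits([10, 20, 30, 40], 30, 1000))
```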



                        SUMMARIES

                %
              Query
  No.  Score  Match  Length  DB  ID        Description

    1  379.2   41.7     662  79  AW690945  AW690945 NF034H11S
    2  339.8   37.3     595  46  AI937960  AI937960 sc06b11.y
    3  337.6   37.1     732  63  AW039993  AW039993 EST282484
    4  337.6   37.1     732  63  AW040005  AW040005 EST282496
    5  336.8   37.0     585  71  AW394676  AW394676 sh34a02.y
    6  324.4   35.6     669  74  AW559248  AW559248 EST306084
c   7  318.6   35.0     796  71  AW349629  AW349629 GM210005B
    8  308.4   33.9     649  44  AI759963  AI759963 sb66h11.y
    9  307.2   33.8     688  45  AI900160  AI900160 sc01f12.y
   10  306.8   33.7     592  80  AW705028  AW705028 sk41f03.y
   11  306.6   33.7     555  45  AI901151  AI901151 sc21c12.y
   12  305.6   33.6     658  45  AI900170  AI900170 sc01g12.y
   13  304.2   33.4     549  45  AI901141  AI901141 sc21b12.y
   14  303.4   33.3     713  44  AI759954  AI759954 sb66g11.y
   15    299   32.9     625  79  AW688369  AW688369 NP006P02S
   16  295.4   32.5     506  46  AI965741  AI965741 sc75d09.y
   17    291   32.0     533  74  AW573665  AW573665 EST316256
   18  290.2   31.9     549  36  AI162543  AI162543 A019P20U
   19    282   31.0     606  46  AI941239  AI941239 sb86c10.y
   20  278.6   30.6     485  74  AW573660  AW573660 EST316251
   21  277.2   30.5     469  47  AU029919  AU029919 AU029919
   22  267.2   29.4     517  80  AW705209  AW705209 sk43a11.y
   23  266.4   29.3     622  43  AI727570  AI727570 BNLGH1842
   24  264.8   29.1     410  74  AW559842  AW559842 EST314890
   25    262   28.8     680  79  AW690086  AW690086 NP028B10S
   26  261.2   28.7     435  44  AI812534  AI812534 12D8 Pine
   27  257.2   28.3     437  69  AW202293  AW202293 sf13c10.y
   28  255.4   28.1     638  43  AI731040  AI731040 BNLGH1845
   29  252.2   27.7     463  48  AU082692  AU082692 AU082692
   30  249.8   27.5     401  40  AI495724  AI495724 sb15e06.y
   31  249.8   27.5     568  44  AI775563  AI775563 EST256663
   32  247.2   27.2     696  80  AW694335  AW694335 NF075C06S
   33    244   26.8     698  64  AW109094  AW109094 gate0002P
   34  237.8   26.1     556  69  AW219991  AW219991 EST302474
   35  236.6   26.0     551  79  AW621657  AW621657 EST312455
   36  231.8   25.5     560  80  AW738459  AW738459 EST339886
   37    229   25.2     688  79  AW685566  AW685566 NF031H02N
   38    227   24.9     617  69  AW218480  AW218480 EST303663
   39    225   24.7     524  43  AI730323  AI730323 BNLGH1662
c  40  224.6   24.7     621  62  AV440631  AV440631 AV440631
   41  224.4   24.7     460  44  AI795130  AI795130 sb77d11.y
   42  218.4   24.0     720  64  AW108575  AW108575 gateOOOH
   43  215.2   23.6     353  48  AU058227  AU058227 AU058227
   44  214.6   23.6     378  40  AI460950  AI460950 sa78f02.y
c  45  212.8   23.4     616  47  AI999128  AI999128 701554548



AW690945 

LOCUS AW690945 662 bp mRNA EST 17-APR-2000

DEFINITION NF034H11ST1F1000 Developing stem Medicago truncatula cDNA clone 
NF034H11ST 5', mRNA sequence. 



ACCESSION AW690945
VERSION AW690945.1 GI:7565604
KEYWORDS EST.
SOURCE barrel medic.
ORGANISM Medicago truncatula
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; Rosidae; eurosids I; Fabales;
Fabaceae; Papilionoideae; Medicago.
REFERENCE 1 (bases 1 to 662)
AUTHORS Torres-Jerez,I., Scott,A.D., Harris,A.R., Gonzales,R.A., Bell,C.J.,
Flores,H.R., Inman,J.T., Weller,J.W. and May,G.D.
TITLE Expressed Sequence Tags from the Samuel Roberts Noble Foundation -
Center for Medicago Genomics Research
JOURNAL Unpublished (2000)
COMMENT On Jan 6, 2000 this sequence version replaced gi: 6676601.
Contact: Dixon RA
Plant Biology Division
The Samuel Roberts Noble Foundation
2510 Sam Noble Parkway, Ardmore, OK 73402, USA
Tel: 580 221 7302
Fax: 580 221 7380
Email: radixon@noble.org
Insert Length: 662 Std Error: 0.00
Plate: 034 row: H column: 11
Seq primer: TCACACAGGAAACAGCTATGAC.
FEATURES Location/Qualifiers
source 1..662
/organism="Medicago truncatula"
/db_xref="taxon:3880"
/clone="NF034H11ST"
/clone_lib="Developing stem"
/tissue_type="stem"
/dev_stage="Pooled developmental"
/note="Vector: Lambda zap; Contains a mixture of
internodal stem segments"
BASE COUNT 207 a 100 c 140 g 211 t 4 others



Query Match 41.7%; Score 379.2; DB 79; Length 662;
Best Local Similarity 81.9%; Pred. No. 1.3e-81;
Matches 435; Conservative 0; Mismatches 96; Indels 0; Gaps



Qy 2 aaaaacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggg 61 

II II HUM Hill II llllllllllllll II II II Mill Mill II 

Db 121 AAGAAGAATGAGTACTGCTAGGTTTATCAAGTGTGTAACAGTAGGGGATGGTGCTGTTGG 180 

Qy 62 gaaaacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaac 121

llllllll lllll II II Mill llllllll II II II llllllllllllll 
Db 181 AAAAACTTGCATGCTTATATCCTATACAAGCAATACCTTTCCCACTGATTATGTTCCAAC 240 

Qy 122 agtatttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcct 181 

ii nm ii ii urn iiiii Milium n inn n inn i 

Db 241 TGTGTTTGACAATTTCAGTGCTAATGTAGTGGTGGATGGTAGTACAGTTAATCTTGGTTT 300 
Qy 182 atgggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagc 241 

mm iiiii ii Minimi nm iiiiim limn 

Db 301 ATGGGATACTGCAGGACAAGAAGATTACAATAGATTAAGGCCATTGAGTTACAGAGGAGC 360 
Qy 242 tgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaa 301 

iiiiiiiimiiiii iiiiiiii ii ii ii ii mum iiiii hi 

Db 361 TGATGTGTTTTTGTTGTGTTTTTCTCTCATTAGTAAAGCTAGTTATGAGAACATTTCCAA 420 

Qy 302 aaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaac 361 

lllllllll I IIIII 1111111111111 IIIII II lllllll II Mill 
Db 421 AAAGTGGATATCTGAGCTGAGACATTATGCTCCAAATGTGCCTATTGTGCTGGTGGGAAC 480

Qy 362 caaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaat 421 

III II llllllllll llllllll II I II IIIII llllllll INI III' 

Db 481 AAAATTANATTTGCGAGACGACAAGCAATTTTTTATCGATCATCCTGGAGCGACACAAAT 540 

Qy 422 atcaacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatg 481 

i mi ii mi mm mini nm miiiiin in n 11 n 
II III I Nil I III Nil I I lllllll! II 111)111! II I
Db 223 TGCAAGTGAAAGGCAAACCTGTGCACCTCCACATCTGGGACACAGCAGGGCAAGATGACT 282

Qy 208 ataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttctc 267 

Db 283 ATGACCGCCTGCGGCCCCTGTTCTACCCTGACGC^ 342 

Qy 268 ttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagacatt 327 

1 1 ill i ill i ill iiiiiii ii mi inn 1 1 in 

Db 343 TCACCAGCCCGAACAGCTTTGACAACATCTTTAACCGGTGGTACCCAGAAGTGAATCATT 402

Qy 328 atgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatgacaa 385 

I II Hill I I I II II III II MM I Mill 
Db 403 TCTGCAAGAAGGTACCCATCATCGTCGTGGGCTGCAAGACTGACCTGCGCAAGGACAA 460 



RESULT 13 
US-08-766-551-2 

Sequence 2, Application US/08766551 

Patent No. 5840569 
GENERAL INFORMATION: 



APPLICANT: Hillman, Jennifer L.
APPLICANT: Bandman, Olga
APPLICANT: Hawkins, Phillip R.
APPLICANT: Goli, Surya K.
TITLE OF INVENTION: NOVEL HUMAN GTP-BINDING PROTEINS
NUMBER OF SEQUENCES: 9
CORRESPONDENCE ADDRESS:
ADDRESSEE: INCYTE PHARMACEUTICALS, INC.

STREET: 3174 Porter Drive 

CITY: Palo Alto 

STATE: CA 

COUNTRY: US 

ZIP: 94304 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/766,551 

FILING DATE: Herewith 

CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 

FILING DATE: 
ATTORNEY/AGENT INFORMATION: 

NAME: Billings, Lucy J. 

REGISTRATION NUMBER: 36,749 

REFERENCE/DOCKET NUMBER: PF-0168 US 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 415-855-0555 

TELEFAX: 415-845-4166 

TELEX: 

INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 719 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
IMMEDIATE SOURCE: 

LIBRARY: SEQ ID NO: 2 

CLONE: 113700 
US-08-766-551-2 



Query Match 10.2%; Score 93; DB 3; Length 719; 

Best Local Similarity 54.2%; Pred. No. 4e-16;

Matches 189; Conservative 0; Mismatches 160; Indels 0; Gaps 

Qy 52 gagctgtggggaaaacttgtatgctcatttcatataccagcaatactttcccaacggatt 111 

I II II III II II I II III III I III II II I 
Db 33 GCGCGGTTTGGATGACAAACATTGGGGTGAGCTACACCCCCAACGGCTACCCCACCGAGT 92

Qy 112 atgttccaacagtatttgataactttagtgccaatgtggtggtggatggcagcacagtga 171

I I II II I II M lllll II III lllllll! I I MM

Db 93 ACATCCCTACTGCCTTCGACAACTTCTCCGCGGTGGTGTCTGTGGATGGGCGGCCCGTGA 152

Qy 172 accttggcctatgggacactgccgggcaagaagattataataggctaaggccactgagtt 231

II II II lllllllllll II II II II I I III lllll II I I
Db 153 GACTCCAACTCTGTGACACTGCCGGACAGGATGAATTTGACAAGCTGAGGCCTCTCTGCT 212

Qy 232 atagaggagctgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaa 291

II I II I I! II I III II I III I III

Db 213 ACACCAACACAGACATCTTCCTGCTCTGCTTCAGTGTCGTGAGCCCCTCATCCTTCCAGA 272

Qy 292 acatctacaaaaagtggatcccagagctaagacattatgctcataatgtaccagttgtgc 351

II II I II III I II III III I II IN II I I I
Db 273 ACGTCAGTGAGAAATGGGTGCCGGAGATTCGATGCCACTGTCCCAAAGCCCCCATCATCC 332

Qy 352 ttgttggaaccaaactagatttgcgagatgacaagcagttcctcattga 400

I llllllll I III I II II II I llllllllll
Db 333 TAGTTGGAACGCAGTCGGATCTCAGAGAAGATGTCAAAGTCCTCATTGA 381



RESULT 14 
US-08-247-946A-5 

Sequence 5, Application US/08247946A 
Patent No. 5792638 
GENERAL INFORMATION: 

APPLICANT: AARONSON, S.A.; CHAN, A.; 

APPLICANT: MIKI, T. 

TITLE OF INVENTION: NOVEL HUMAN RAS-RELATED 

TITLE OF INVENTION: ONCOGENES UNMASKED BY EXPRESSION O 

TITLE OF INVENTION: CLONING 

NUMBER OF SEQUENCES: 11 

CORRESPONDENCE ADDRESS: 

ADDRESSEE: MORGAN & FINNEGAN

STREET: 345 PARK AVENUE 

CITY: NEW YORK 

STATE: NEW YORK 

COUNTRY: USA 

ZIP: 10154 
COMPUTER READABLE FORM: 

MEDIUM TYPE: FLOPPY DISK 

COMPUTER: IBM PC COMPATIBLE 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: WORDPERFECT 5.1 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/247,946A

FILING DATE: 24-MAY-1994 

CLASSIFICATION: 536 
ATTORNEY/AGENT INFORMATION: 

NAME: DOROTHY R. AUTH 

REGISTRATION NUMBER: 36,434 

REFERENCE/DOCKET NUMBER: 2026-4150 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (212) 758-4800 

TELEFAX: (212) 751-6849 

TELEX: 421792 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 615 

TYPE: Nucleic acid 

STRANDEDNESS: Double 

TOPOLOGY: Unknown 
MOLECULE TYPE: CDNA 
HYPOTHETICAL: No 
ORIGINAL SOURCE: 

ORGANISM: Human 

STRAIN: 

INDIVIDUAL ISOLATE: 
DEVELOPMENTAL STAGE: 
HAPLOTYPE: 
TISSUE TYPE: 
CELL TYPE: 
CITY: Palo Alto

STATE: CA

COUNTRY: USA

ZIP: 94304
COMPUTER READABLE FORM:

MEDIUM TYPE: Diskette

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/842,976 

FILING DATE: Herewith 

CLASSIFICATION: 514 
PRIOR APPLICATION DATA:

APPLICATION NUMBER: 

FILING DATE: 
ATTORNEY/AGENT INFORMATION: 

NAME: Billings, Lucy J. 

REGISTRATION NUMBER: 36,749 

REFERENCE/DOCKET NUMBER: PF-0267 US
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 415-855-0555 

TELEFAX: 415-845-4166 

TELEX:

INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 702 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
IMMEDIATE SOURCE: 

LIBRARY: LUNGNOT10 

CLONE: 1379718 
US-08-842-976-2



Query Match 11.5%; Score 105; DB 5; Length 702;
Best Local Similarity 55.7%; Pred. No. 2.5e-19;
Matches 201; Conservative 0; Mismatches 160; Indels 0;



Qy 28 tcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatata 87 

Mill II! Ill II II II I UNI || | Nil || ' | 
Db 97 TCAAGGTGGTCCTGGTGGGCGACGGCGGCTGCGGGAAGACGTCGCTGCTGATGGTCTTCG 156 

Qy 88 ccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatg 147 

II I Mill I II II II II III I I I III 
Db 157 CCGATGGGGCCTTCCCCGAGAGCTACACCCCCACGGTGTTTGAGCGGTACATGGTCAACC 216 

Qy 148 tggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaagatt 207 

ii iii i mi i iii mi i i minim n inn inn i 

Db 217 TGCAAGTGAAAGGCAAACCTGTGCACCTCCACATGTGGGACACAGCAGGGCAGGAAGACT 276 
Qy 208 ataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttctc 267 

ii ii i ii mi ii ii i mini i i n in n 

Db 277 ATGATCGACTGCGGCCTCTCTCCTACCCGGACACTGATGTCATCCTCATGTGCTTCTCCA 336 

Qy 268 ttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagacatt 327 

I III I III 1 1 1 1 1 1 1 ! I lllllll Mill! I I IN 
Db 337 TCGACAGCCCTGACAGCCTGGAAAACATTCCTGAGAAGTGGACCCCAGAGGTGAAGCACT 396 

Qy 328 atgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatgacaagc 387 

I II INI I I II II I! I II II II I I III III 
Db 397 TCTGCCCCAACGTGCCCATCATCCTGGTGGGGAATAAGAAGGACCTGAGGCAAGACGAGC 456 



Db 457 A 457 



RESULT 10 
US-09-213-397-2 

; Sequence 2, Application US/09213397 
; Patent No. 6063377 



GENERAL INFORMATION: 
APPLICANT: Hillman, Jennifer L. 
APPLICANT: Goli, Surya R. 
TITLE OF INVENTION: NOVEL HUMAN RHO PROTEIN 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Incyte Pharmaceuticals, Inc. 

STREET: 3174 Porter Drive 

CITY: Palo Alto 

STATE: CA 

COUNTRY: USA 

ZIP: 94304 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/213,397 

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/842,976 

FILING DATE: 04/17/1997 
ATTORNEY/AGENT INFORMATION: 

NAME: Billings, Lucy J. 

REGISTRATION NUMBER: 36,749 

REFERENCE/DOCKET NUMBER: PF-0267 US 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 415-855-0555 

TELEFAX: 415-845-4166 

TELEX: 

INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 702 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
IMMEDIATE SOURCE: 

LIBRARY: LUNGNOT10 

CLONE: 1379718 
US-09-213-397-2 



Query Match 11.5%; Score 105; DB 5; Length 702;

Best Local Similarity 55.7%; Pred. No. 2.5e-19;

Matches 201; Conservative 0; Mismatches 160; Indels 0; Gaps 

Qy 28 tcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatata 87

llll! Ill III II II II I lllll II I llll II I
Db 97 TCAAGGTGGTCCTGGTGGGCGACGGCGGCTGCGGGAAGACGTCGCTGCTGATGGTCTTCG 156

Qy 88 ccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatg 147

II I III!! I II I! II II lllll I I I III
Db 157 CCGATGGGGCCTTCCCCGAGAGCTACACCCCCACGGTGTTTGAGCGGTACATGGTCAACC 216

Qy 148 tggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaagatt 207

II III I llll I III III! I I MIMM II lllll lllll I
Db 217 TGCAAGTGAAAGGCAAACCTGTGCACCTCCACATGTGGGACACAGCAGGGCAGGAAGACT 276

Qy 208 ataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttctc 267
II II I I! III! II II I lllllll I I II III II
Db 277 ATGATCGACTGCGGCCTCTCTCCTACCCGGACACTGATGTCATCCTCATGTGCTTCTCCA 336

Qy 268 ttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagacatt 327
I III' I III 1 1 1 1 1 1 1 1 ! lllllll lllllll I I II I
Db 337 TCGACAGCCCTGACAGCCTGGAAAACATTCCTGAGAAGTGGACCCCAGAGGTGAAGCACT 396

Qy 328 atgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatgacaagc 387

I I! II II I I II II II III II II I I III III
Db 397 TCTGCCCCAACGTGCCCATCATCCTGGTGGGGAATAAGAAGGACCTGAGGCAAGACGAGC 456

Qy 388 a 388



Tue Sep 5 07:22:58 2000 



us-08-984-099-12.rni 



Page 4 



Qy 30 aagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatatacc 89

III II III II II II II II II I! II Mil II I I

Db 19 aagctggtggtggtgggcgacggcgcgtgtggcaagacgtgcctgctgatcgtgttcagt 78

Qy 90 agcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatgtg 149

I I Mill II II II II II II II II Mil I III I I

Db 79 aaggacgagttccccgaggtgtacgtgcccaccgtcttcgagaactatgtggccgacatt 138



Qy 150 gtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaagattat 209 

I llllll MM III I II I II llllllll II II II II II II 
Db 139 gaggtggacggcaagcaggtggagctggcgctgtgggacacggcgggccaggaggactac 198 

Qy 210 aataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttctctt 269 

I I II Mil II II I I II II II I II III II I 
Db 199 gaccgcctgcggccgctctcctacccggacaccgacgtcattctcatgtgcttctcggtg 258

Qy 270 ataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagacattat 329 

III II I II llllll I I llllll Mil III I I II I 
Db 259 gacagcccggactcgctggagaacatccccgagaagtgggtccccgaggtgaagcacttc 318 

Qy 330 gctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatgacaagcag 389 

• II Mill II I I II II I I MM III Mil III MM 
Db 319 tgtcccaatgtgcccatcatcctggtggccaacaaaaaagacctgcgcagcgacgagcat 378 

Qy 390 ttcctcattgatc 402

III II II I 
Db 379 gtccgcacagagc 391 



RESULT 6 
US-08-055-797-1 
Sequence 1, Application US/08055797
Patent No. 5324830 
GENERAL INFORMATION: 
APPLICANT: RESNICK, MICHAEL A
APPLICANT: CHOW, TERRY 
APPLICANT: PERKINS, ED 

TITLE OF INVENTION: A chimeric protein that has a human rho 
TITLE OF INVENTION: motif and deoxyribonuclease activity. 
NUMBER OF SEQUENCES: 2 



ADDRESSEE: CUSHMAN, DARBY & CUSHMAN

STREET: Eleventh Floor, 1615 L. Street, N.W.

CITY: Washington 

STATE: D.C. 

COUNTRY: U.S.A. 

ZIP: 20036-5601 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/055,797 

FILING DATE: 

CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/07/674,801 

FILING DATE: 
ATTORNEY/AGENT INFORMATION: 

NAME: SCOTT, WATSON T 

REGISTRATION NUMBER: 26,581 

REFERENCE/DOCKET NUMBER: WTS/5683/83921/SRL 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (202) 861-3000 

TELEFAX: (202) 822-0944 

TELEX: 6714627CUSH 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2282 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double
TOPOLOGY:
FEATURE:
NAME/KEY:
LOCATION: 120..1574
US-08-055-797-1



Query Match 11.8%; Score 107.4; DB 1; Length 2282; 

Best Local Similarity 58.3%; Pred. No. 8.9e-20;

Matches 208; Conservative 0; Mismatches 146; Indels 3; Gaps 1; 

Qy 28 tcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatata 87 

I III III II II Mill Ml Mill II II MM II II Ml 

Db 334 TAAAGATTGTTGTTGTGGGAGATGGCGCTGTAGGGAAGACGTGCCTGCTGATATCTTATG 393 

Qy 88 ccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatg 147 

I II II II II llllll MM II I II II II I I I II 
Db 394 TCCAAGGAACATTTCCGACTGATTATATTCCTACTATTTTCGAAAATTATGTCACAAACA 453 

Qy 148 tggtggtggatggcagcacagtga---accttggcctatgggacactgccgggcaagaag 204 

III II III III IIIIIMIIIIIIIII Mill 
Db 454 TAGAAGGACCCAACGGTCAAATTATAGAATTGGCATTATGGGACACTGCCGGCCAAGAAG 513 

Qy 205 attataataggctaaggccactgagttatagaggagctgatgtgtttttgttggcctttt 264 

I MM III II II II II Mill II llllll I II II II II 

Db 514 AGTATAGTAGACTTAGACCGCTTTCATATAGGAATGCAGATGTGCTGATGGTGTGCTATT 573 

Qy 265 ctcttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagac 324 

II II II III I I MM III III llllllll I I I 

Db 574 CTGTTGGTAGTAAGACATCGCTTAAAAATGTGGAAGATCTCTGGTTCCCAGAGGTTAAGC 633 

Qy 325 attatgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatg 381 
Db 634 ATTTTTGTCCTTCCACTCCAATCATGCTA^ 690 



RESULT 7 
US-07-914-284A-6 
Sequence 6, Application US/07914284A 
Patent No. 5489524 
GENERAL INFORMATION: 
APPLICANT: Chow, Terry Y.-R.
APPLICANT: Resnick, Michael A. 
APPLICANT: Perkins, Edward 

TITLE OF INVENTION: A CHIMERIC PROTEIN THAT HAS A HUMAN RHO 

TITLE OF INVENTION: MOTIF AND DEOXYRIBONUCLEASE ACTIVITY 

NUMBER OF SEQUENCES: 9 

CORRESPONDENCE ADDRESS: 
ADDRESSEE: Knobbe, Martens, Olson & Bear
STREET: 620 Newport Center Drive, Sixteenth Floor
CITY: Newport Beach 

STATE: CA 

COUNTRY: USA 

ZIP: 92660 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/914,284A

FILING DATE: 14-JUL-1992 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/674,801 

FILING DATE: 26-MAR-1991 
ATTORNEY/AGENT INFORMATION: 

NAME: Altman, Daniel E.

REGISTRATION NUMBER: 34,115 

REFERENCE/DOCKET NUMBER: NIH022.022CP1
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (714) 760-0404

TELEFAX: (714) 760-9502 
Qy 376 gagatgacaagcagttcctcattgatcaccctggagcaa------caccaatatcaacat 429

I Mil llll II I I I II I I I 

Db 523 ggaatgatgagcacacaaggcgggagctagccaagatgaagcaggagccggtgaaacctg 582 

Qy 430 ctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcagctcca 489 

I II III' I , II I III II II II II II II II II 
Db 583 aagaaggcagagatatggcaaacaggattggcgcttttgggtacatggagtgttcagcaa 642 

Qy 490 aaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgaggccac 549 

I III II I llll I Mill II llll II III || I 
Db 643 agaccaaagatggagtgagagaggtttttgaaatggctacgagagctgctctgcaagcta 702 

Qy 550 caaaaccaaagagaaagccttg 571 

I llll III II I 
Db 703 gacgtgggaagaaaaaatctgg 724 



RESULT 2 
US-09-161-015-1 

; Sequence 1, Application US/09161015A 
; Patent No. 5965370 
; GENERAL INFORMATION: 

; APPLICANT: Lex M. Cowsert 

; TITLE OF INVENTION: ANTISENSE MODULATION OF RhoG EXPRESSION 

; FILE REFERENCE: RTS-0015 

; CURRENT APPLICATION NUMBER: US/09/161,015A 

; CURRENT FILING DATE: 1998-09-25 

; NUMBER OF SEQ ID NOS: 47 

; SEQ ID NO 1 

; LENGTH: 1284 

; TYPE: DNA 

; ORGANISM: Homo sapiens 

; FEATURE: 

; NAME/KEY: CDS 

; LOCATION: (130)..(705) 

US-09-161-015-1 



Query Match 14.9%; Score 135.2; DB 4; Length 1284; 

Best Local Similarity 55.7%; Pred. No. 2.7e-27; 

Matches 282; Conservative 0; Mismatches 218; Indels 6; Gaps 1; 

Qy 27 atcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatat 86 

1 1 1 1 1 1 1 1 II III llllllll llllllll II II II lllllll I II 
Db 139 atcaagtgcgtggtggtgggtgatggggctgtgggcaagacgtgcctgctcatctgctac 198 

Qy 87 accagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaat 146 

II I II lllllll I II II I II II II II II II I II II I 

Db 199 acaactaacgctttccccaaagagtacatccccaccgtgttcgacaattacagcgcgcag 258 

Qy 147 gtggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaagat 206 

i ii ii ii illinium in minimi n n n n 

Db 259 agcgcagttgacgggcgcacagtgaacctgaacctgtgggacactgcgggccaggaggag 318 

Qy 207 tataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttct 266 

III I I II I llll II I I II II I I II II 

Db 319 tatgaccgcctccgtacactctcctaccctcagaccaacgttttcgtcatctgtttctcc 378 

Qy 267 cttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagacat 326 

ii ii ii inn iii i i nun mm i i n 

Db 379 attgccagtccgccgtcctatgagaacgtgcggcacaagtggcatccagaggtgtgccac 438 

Qy 327 tatgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgag at 380 

I I I llll II I llll II II lllll II II III I 
Db 439 cactgccctgatgtgcccatcctgctggtgggcaccaagaaggacctgagagcccagcct 498 

Qy 381 gacaagcagttcctcattgatcaccctggagcaacaccaatatcaacatctcagggagaa 440 

llll I lllllll I II II II I lllll I 
Db 499 gacaccctacggcgcctcaaggagcagagccaggcgcccatcacaccgcagcagggccag 558 

Qy 441 gaactaaagaagatgataggagcagttacttatatagaatgcagctccaaaacccaacag 500 

I III III III II II II I llllll II II I 
Db 559 gcactcgcgaaacagatccacgctgtgcgctacctcgaatgctcagccctgcaacaggat 618 



Qy 501 aatgtgaaggctgttttcgatgctgc 526 

III llll II llll I II 
Db 619 ggtgtcaaggaagtgttcgccgaggc 644 
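
The Best Local Similarity reported for the hit above (55.7%) appears to equal Matches divided by the number of aligned columns (Matches + Mismatches + Indels). That definition is inferred from the reported numbers, not taken from the search program's documentation. A minimal Python sketch that parses a statistics line and recomputes the figure follows; the function names are hypothetical.

```python
import re

def parse_counts(stats_line):
    """Extract 'Name value;' integer pairs from a 'Matches ...' summary line."""
    pairs = re.findall(r"([A-Za-z]+)\s+(\d+);", stats_line)
    return {name: int(value) for name, value in pairs}

def local_similarity(counts):
    """Percent identity over aligned columns (assumed definition, see above)."""
    columns = counts["Matches"] + counts["Mismatches"] + counts["Indels"]
    return 100.0 * counts["Matches"] / columns

# Statistics line copied from the US-09-161-015-1 hit.
line = "Matches 282; Conservative 0; Mismatches 218; Indels 6; Gaps 1;"
counts = parse_counts(line)
print(f"{local_similarity(counts):.1f}%")  # prints 55.7%
```

The same check can be used to spot OCR damage in other statistics lines: if the recomputed percentage disagrees with the printed one, at least one of the counts was probably misread.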



RESULT 3 
US-08-846-790A-2 

; Sequence 2, Application US/08846790A 

; Patent No. 5973130 

; Patent No. 5973130 5840864 

; GENERAL INFORMATION: 

; APPLICANT: Hillman, Jennifer L. 

; APPLICANT: Corley, Neil C. 

; APPLICANT: Shah, Purvi 

; TITLE OF INVENTION: RAS-LIKE PROTEIN 

; NUMBER OF SEQUENCES: 3 

; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Incyte Pharmaceuticals, Inc. 

STREET: 3174 Porter Drive 

CITY: Palo Alto 

STATE: CA 

COUNTRY: USA 

ZIP: 94304 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/846,790A 

FILING DATE: Herewith 

CLASSIFICATION: 514 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 

FILING DATE: 
; ATTORNEY/AGENT INFORMATION: 

NAME: Billings, Lucy J. 

REGISTRATION NUMBER: 36,749 

REFERENCE/DOCKET NUMBER: PF-0388 US 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 650-855-0555 

TELEFAX: 650-845-4166 

TELEX: 

; INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2964 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
; IMMEDIATE SOURCE: 

LIBRARY: COLNTUT16 

CLONE: 2791521 
US-08-846-790A-2 



Query Match 14.8%; Score 134.4; DB 4; Length 2964; 

Best Local Similarity 54.4%; Pred. No. 6e-27; 

Matches 295; Conservative 0; Mismatches 241; Indels 6; Gaps 

Qy 28 tcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatata 87 

i iimiii m ii ii ii ii minimi n mi n n 

Db 471 TGAAGTGTGTGGTGGTGGGGGACGGTGCCGTGGGGAAAACCTGCCTGCTGATGAGCTACG 530 

Qy 88 ccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatg 147 

III I I I llllll III II II II II II lllll llll II II 
Db 531 CCAACGACGCCTTCCCAGAGGAATACGTGCCCACTGTGTTTGACCACTATGCAGTTACTG 590 

Qy 148 tggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaagatt 207 

II llll llll I II II II I lllll II II II II II I 
Db 591 TGACTGTGGGAGGCAAGCAACACTTGCTCGGACTGTATGACACCGCGGGACAGGAGGACT 650 

Qy 208 ataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttctc 267 
inn in in n n n i inn ii i mi ii i 



Qy 88 ccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatg 147 

II I lllll I II II II II Mill I I I III 

Db 152 CCGATGGGGCCTTCCCCGAGAGCTACACCCCCACGGTGTTTGAGCGGTACATGGTCAACC 211 

Qy 148 tggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaagatt 207 

ii iii i mi i m mi 1 1 Mini! ii iinmi n i 

Db 212 TGCAAGTGAAAGGCAAACCTGTGCACCTCCACATCTGGGACACAGCAGGGCAAGATGACT 271 

Qy 208 ataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttctc 267 

III I II MM III II I II II I II I III I 

Db 272 ATGACCGCCTGCGGCCCCTGTTCTACCCTGACGCCAGCGTCCTGCTGCTTTGCTTCGATG 331 

Qy 268 ttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagacatt 327 

I I III I III I III lllllll II MM MUM I I MM 
Db 332 TCACCAGCCCGAACAGCTTTGACAACATCTTTAACCGGTGGTACCCAGAAGTGAATCATT 391 

Qy 328 atgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatgacaa 385 

I II lllll I I I II II III II MM I Mill 
Db 392 TCTGCAAGAAGGTACCCATCATCGTCGTGGGCTGCAAGACTGACCTGCGCAAGGACAA 449 



Search completed: September 3, 2000, 03:10:10 
Job time: 28556 sec 
CC dermatomyositis, diabetes mellitus, emphysema, atrophic gastritis, 

CC glomerulonephritis, gout, Graves' disease, hypereosinophilia, irritable 

CC bowel syndrome, lupus erythematosus, multiple sclerosis, myasthenia 

CC gravis, myocardial or pericardial inflammation, osteoarthritis, 

CC osteoporosis, pancreatitis, polymyositis, rheumatoid arthritis, 

CC scleroderma, Sjogren's syndrome, and autoimmune thyroiditis. They can 

CC also be used to treat complications of cancer, haemodialysis, and 

CC extracorporeal circulation, viral, bacterial, fungal, parasitic, 

CC protozoal, and helminthic infections, and trauma. The products can also 

CC be used to treat diseases associated with apoptosis, such as AIDS and 

CC other infectious or genetic immunodeficiencies, neurodegenerative 

CC diseases such as Alzheimer's disease, amyotrophic lateral sclerosis, 

CC Parkinson's disease, retinitis pigmentosa and cerebellar degeneration, 

CC myelodysplastic syndromes such as aplastic anaemia, ischaemic injuries 

CC such as myocardial infarction, stroke and reperfusion injury, toxin- 

CC induced diseases such as cachexia, viral infections such as those caused 

CC by hepatitis B and C and osteoporosis. 

SQ Sequence 2966 BP; 822 A; 715 C; 641 G; 786 T; 



Query Match 14.8%; Score 134.4; DB 1; Length 2966; 

Best Local Similarity 54.4%; Pred. No. 1.4e-24; 

Matches 295; Conservative 0; Mismatches 241; Indels 6; Gaps 

Qy 28 tcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatata 87 

I llllllll III II II II II Mlllllllll II MM II II 

Db 473 TGAAGTGTGTGGTGGTGGGGGACGGTGCCGTGGGGAAAACCTGCCTGCTGATGAGCTACG 532 

Qy 88 ccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatg 147 

III I I I MMII III II II II II II Mill III I II II 
Db 533 CCAACGACGCCTTCCCAGAGGAATACGTGCCCACTGTGTTTGACCACTATGCAGTTACTG 592 

Qy 148 tggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaagatt 207 

II MM II 1 1 I II II II I Mill II II II M II I 

Db 593 TGACTGTGGGAGGCAAGCAACACTTGCTCGGACTGTATGACACCGCGGGACAGGAGGACT 652 

Qy 208 ataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttctc 267 

I II Ml llllllll II I IIIMMIMM I Ml III 
Db 653 ACAACCAGCTGAGGCCACTCTCCTACCCCAACACGGATGTGTTTTTGATCTGCTTCTCTG 712 

Qy 268 ttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagacatt 327 

I III I III III I II II I II III MM Mill I I I 
Db 713 TCGTAAACCCTGCCTCTTACCACAATGTCCAGGAGGAATGGGTCCCCGAGCTCAAGGACT 772 

Qy 328 atgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatgac---- 383 

I II II II III II I II III I I III I II MMII 
Db 773 GCATGCCTCACGTGCCTTATGTCCTCATAGGGACCCAGATTGATCTCCGTGATGACCCAA 832 

Qy 384 --aagcagttcctcattgatcaccctggagcaacaccaatatcaacatctcagggagaag 441 

I I II I I II I III I I II II I 
Db 833 AAACCTTGGCCCGTTTGCTGTATATGAAAGAGAAACCTCTCACTTACGAGCATGGTGTGA 892 

Qy 442 aactaaagaagatgataggagcagttacttatatagaatgcagctccaaaacccaacaga 501 

I II II III MMII II I Mill I Mill 
Db 893 AGCTCGCAAAAGCGATCGGAGCACAGTGCTACTTGGAATGTTCAGCTCTGACTCAGAAAG 952 

Qy 502 atgtgaaggctgttttcgatgctgcaataaaagtagctttgaggccaccaaaaccaaaga 561 

I I II II IIMI MM Mill III II III Mil 

Db 953 GTCTCAAAGCGGTTTTTGATGAAGCAATCCTCACCATTTTCCACCCCAAGAAAAAGAAGA 1012 

Qy 562 ga 563 

I 

Db 1013 AA 1014 



RESULT 12 
Q15017 

ID Q15017 standard; DNA; 2282 BP. 
AC Q15017; 

DT 25-FEB-1992 (first entry) 

DE Encodes yeast endo-exonuclease RhoNUC. 

KW yeast cell cycle; rho/ras oncogene-like motif; RNC1 gene; ss. 

OS Saccharomyces cerevisiae. 



FH Key Location/Qualifiers 

FT CDS 120..1577 

FT /*tag= a 

PN US7674801-A. 

PD 05-NOV-1991. 

PF 05-NOV-1991; 674801. 

PR 26-MAR-1991; US-674801. 

PA (USSH ) US DEPT HEALTH & HUMAN. 

PI Resnick MA, Chow T, Perkins E; 

DR WPI; 91-361692/49. 

DR P-PSDB; R15343. 

PT Recombinant RhoNUC - useful for characterising agents to modify 

PT cellular growth 

PS Disclosure; Fig 2; 43pp; English. 

CC The RNC1 gene was isolated from a yeast genomic library. It is 

CC predicted to encode a protein of mol.wt. 57kD; the observed mol.wt. 

CC is 72kD and the difference is thought to be due to glycosylation. 

CC The N-terminal region of the deduced amino acid sequence shows 

CC considerable homology with mammalian rho genes which are related to 

CC ras oncogenes. The deduced C-terminal sequence has homology with 

CC E. coli recC. 

SQ Sequence 2282 BP; 770 A; 439 C; 435 G; 638 T; 

Query Match 11.8%; Score 107.4; DB 1; Length 2282; 

Best Local Similarity 58.3%; Pred. No. 7e-18; 

Matches 208; Conservative 0; Mismatches 146; Indels 3; Gaps 

Qy 28 tcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcatata 87 

I III Ml II II Mill Mill Mill II II MM II M III 
Db 334 TAAAGATTGTTGTTGTGGGAGATGGCGCTGTAGGGAAGACGTGCCTGCTGATATCTTATG 393 

Qy 88 ccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaatg 147 

I II II II II MMII I II I II I II II II I I I II 
Db 394 TCCAAGGAACATTTCCGACTGATTATATTCCTACTATTTTCGAAAATTATGTCACAAACA 453 

Qy 148 tggtggtggatggcagcacagtga---accttggcctatgggacactgccgggcaagaag 204 

I I I II III III MIMMMIIIMM iiiiii-f 

Db 454 TAGAAGGACCCAACGGTCAAATTATAGAATTGGCATTATGGGACACTGCCGGCCAAGAAG 513 

Qy 205 attataataggctaaggccactgagttatagaggagctgatgtgtttttgttggcctttt 264 

I MM III II II INI Mill II MMII I II II Mir 
Db 514 AGTATAGTAGACTTAGACCGCTTTCATATAGGAATGCAGATGTGCTGATGGTGTGCTATT 573 

Qy 265 ctcttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagac 324 

INI II Ml I I MM III III llllllll I I I 
Db 574 CTGTTGGTAGTAAGACATCGCTTAAAAATGTGGAAGATCTCTGGTTCCCAGAGGTTAAGC 633 

Qy 325 attatgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatg 381 

III I II I III I MM 1 1 II III Mil I II I 
Db 634 ATTTTTGTCCTTCCACTCCAATCATGCTAGTCGGCCTTAAATCAGATCTATATGAAG 690 
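
The SQ line of the Q15017 record above declares a length of 2282 BP together with per-base counts. A quick consistency check is to confirm that the four counts sum to the declared length, which helps catch OCR damage in either field. The sketch below is a minimal Python illustration; the helper names are hypothetical, and any shortfall is assumed to correspond to bases other than A, C, G and T.

```python
import re

def base_counts(sq_line):
    """Return (declared_length, {base: count}) from an 'SQ Sequence ...' line."""
    length = int(re.search(r"Sequence\s+(\d+)\s+BP", sq_line).group(1))
    counts = {base: int(n)
              for n, base in re.findall(r"(\d+)\s+([ACGT]);", sq_line)}
    return length, counts

def unaccounted(sq_line):
    """Declared length minus the sum of the A, C, G and T counts."""
    length, counts = base_counts(sq_line)
    return length - sum(counts.values())

# SQ line copied from the Q15017 record.
sq = "SQ Sequence 2282 BP; 770 A; 439 C; 435 G; 638 T;"
print(unaccounted(sq))  # prints 0
```

A non-zero result does not necessarily mean the line was misread: a record may contain a few ambiguous bases that are counted separately.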



RESULT 13 
V68232 

ID V68232 standard; cDNA; 702 BP. 
AC V68232; 

DT 16-FEB-1999 (first entry) 

DE Nucleotide sequence encoding human Rho. 

KW ss; human; Rho protein; cell proliferation; inflammation; 

KW transplantation; cancer; gene therapy. 

OS Homo sapiens. 

FH Key Location/Qualifiers 

FT CDS 45..663 

FT /*tag= a 

FT /product= "Human Rho" 

PN WO9846754-A1. 

PD 22-OCT-1998. 

PF 16-APR-1998; U07865. 

PR 17-APR-1997; US-842976. 

PA (INCY-) INCYTE PHARM INC. 

PI Goli SK, Hillman JL; 

DR WPI; 98-609916/51. 
DE Candida CaCdc42 gene. 

KW GTPase; GGPTase; geranylgeranyl transferase; fungal Rho-like GTPase; 

KW antifungal agent identification; mycosis; feed additive; disinfectant; 

KW therapy; Candida cell detection; cell wall integrity; hyphal formation; 

KW pathogenesis; Candida CaCdc42 gene; ds. 

OS Candida sp. 

FH Key Location/Qualifiers 

FT CDS 260..835 

FT /*tag= a 

PN WO9738293-A2. 

PD 16-OCT-1997. 

PF 11-APR-1997; U05987. 

PR 20-DEC-1996; US-771212. 

PR 11-APR-1996; US-631319. 

PA (MITO-) MITOTIX INC. 

PA (UYJO ) UNIV JOHNS HOPKINS. 

DR WPI; 97-512864/47. 

DR P-PSDB; W33897. 

PT Identification of antifungal agents that inhibit GTPase - useful for 

PT specific detection of Candida 

PS Claim 115; Page 84-85; 118pp; English. 

CC This sequence represents the Candida CaCdc42 gene. The encoded protein is 

CC a fungal Rho-like GTPase. The encoded protein can be used in an assay of 

CC the invention. The method of the invention is for identifying potential 

CC antifungal agents (I), and comprises: (a) mixing a fungal geranylgeranyl 

CC transferase (GGPTase), a GGPTase substrate GGPTase, and test compound; 

CC and (b) detecting interaction between GGPTase and GGPTase; a significant 

CC reduction in this interaction indicates that the test compound is a (I). 

CC (I) are useful for treating mycoses in humans or animals; as feed 

CC additives and as disinfectants. This sequence, and MAb specifically 

CC reactive with the encoded protein are used to detect Candida cells 

CC specifically, particularly in cells, tissues and body fluids, while 

CC antisense sequences are used to inhibit expression of these genes. The 

CC method is a rapid, reliable and effective way of detecting agents that 

CC inhibit GTPases, particularly those involved in cell wall integrity, 

CC formation of hyphae and/or other cellular functions necessary for 

CC pathogenesis. (I) should be selective for fungal cells, with little 

CC effect on mammalian cells. 

SQ Sequence 934 BP; 290 A; 152 C; 157 G; 335 T; 



Query Match 16.4%; Score 149.2; DB 1; Length 934; 

Best Local Similarity 56.1%; Pred. No. 2e-28; 

Matches 305; Conservative 0; Mismatches 233; Indels 6; Gaps 1; 

Qy 26 tatcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcata 85 

Db 268 TATAAAAT^ 327 

Qy 86 taccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaa 145 

Mill I II II III I i Ml II II MINI I I II I 

Db 328 TACCACTAGTAAATTTCCAGCTGATTATGTTCCTACTGTTTTTGATAATTATGCTGTAAC 387 

Qy 146 tgtggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaaga 205 

III II I I I I II I I I I II III II Hill II I II III II 
Db 388 CGTGATGATAGGAGACGAACCATTTACCTTGGGATTATTTGATACTGCTGGTCAAGAAGA 447 

Qy 206 ttataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttc 265 

III I II 1111111- II III I MINI II I I Mill 
Db 448 TTACGACAGATTAAGGCCTTTGTCATATCCATCGACTGATGTATTCCTTGTTTGTTTTTC 507 

Qy 266 tcttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagaca 325 

Db 508 CGTCATTTCTCCTGCTTCGTTTGAW^ 567 

Qy 326 ttatgctcataatgtaccagttgtgcttgttggaaccaaactagatttgcgagatgacaa 385 

II II III III I I llll II III II Mill III I II I 
Db 568 CCATTGTCCCGGTGTGCCAATAATTATTGTCGGTACCCAAACTGATTTACGAAACGATGA 627 

Qy 386 gcagttcctcattgatcaccctggagcaa caccaatatcaacatctcagggaga 439 

II I II II I lllll I Hill II 

Db 628 TGTTATTTTACAGAGATTGCACAGACAAAAATTGTCCCCAATCACCCAGGAACAGGGTGA 687 



Qy 440 agaactaaagaagatgataggagcagttacttatatagaatgcagctccaaaacccaaca 499 

I II I III I llll II I III I II II I II III 
Db 688 AAAATTGGCTAAGGAATTGAGAGCTGTCAAGTATGTTGAGTGTTCTGCATTGACTCAAAG 747 

Qy 500 gaatgtgaaggctgttttcgatgctgcaataaaagtagctttgaggccaccaaaaccaaa 559 

llll I II II II I II III II II II II II I II 
Db 748 AGGATTGAAAACAGTGTTTGACGAGGCTATAGTAGCTGCATTAGAACCTCCTGTAATTAA 807 

Qy 560 gaga 563 

I I 

Db 808 AAAA 811 



RESULT 9 
V32555 

ID V32555 standard; RNA; 3243 BP. 

AC V32555; 

DT 13-OCT-1998 (first entry) 

DE Candida albicans CaCdc42p gene. 

KW CaCdc42p; G-protein; rho family; screening; virulence; 

KW hyphal formation; pathogenic fungi; inhibitor; inflammation; 

KW antimycotic; ss. 

OS Candida albicans. 

FH Key Location/Qualifiers 

FT CDS 271..846 

FT /*tag= a 

FT /product= CaCdc42p protein 

PN WO9818927-A1. 

PD 07-MAY-1998. 

PF 29-OCT-1997; CA0809. 

PR 30-OCT-1996; US-029458. 

PA (CANA ) NAT RES COUNCIL CANADA. 

PI Leberer E, Thomas DY; 

DR WPI; 98-272222/24. 

DR P-PSDB; W48897. 

PT In vitro screening test for agents that inhibit Candida genes 

PT involved in virulence - and transition to hyphal form, potentially 

PT useful as antimycotic agents I. 

PS Disclosure; Fig 11; 79pp; English. 

CC The sequence is that encoding the CaCdc42p protein which can be used 
CC in the development of an in vitro screening test for compounds 
CC that inhibit biological activity of the protein and a system for 
CC measuring its activity. The protein is involved in virulence and 
CC hyphal formation. Inhibitors are potentially useful for rendering 
CC pathogenic fungi (any species in which hyphal induction by kinase 
CC occurs) avirulent and/or to treat inflammation. The coding sequence 
CC can be used as source of probes for detecting C. albicans in 
CC amplification or hybridisation assays, also to identify and 
CC clone homologous genes from other fungi. 
SQ Sequence 3243 BP; 1185 A; 541 C; 456 G; 1061 T; 



Query Match 16.4%; Score 149.2; DB 1; Length 3243; 

Best Local Similarity 56.1%; Pred. No. 3e-28; 

Matches 305; Conservative 0; Mismatches 233; Indels 6; Gaps 

Qy 26 tatcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgctcatttcata 85 

in ii mil immiiii n n n mum i i n n n 

Db 279 TATAAAATGTGTTGTTGTCGGTGATGGTGCCGTTGGTAAAACTTGCTTATTAATCTCGTA 338 
Qy 86 taccagcaatactttcccaacggattatgttccaacagtatttgataactttagtgccaa 145 

mil i ii ii iii i milium n n iiiimi i i n i 

Db 339 TACCACTAGTAAATTTCCAGCTGATTATGTTCCTACTGTTTTTGATAATTATGCTGTAAC 398 
Qy 146 tgtggtggtggatggcagcacagtgaaccttggcctatgggacactgccgggcaagaaga 205 

iii ii i i ii ii i i 1 1 ii iii ii mn ii mum 

Db 399 CGTGATGATAGGAGACGAACCATTTACCTTGGGATTATTTGATACTGCTGGTCAAGAAGA 458 

Qy 206 ttataataggctaaggccactgagttatagaggagctgatgtgtttttgttggccttttc 265 

III I II I II 1 1 1 1 II III I 1 1 1 1 1 1 1 II I I lllll 
Db 459 TTACGACAGATTAAGGCCTTTGTCATATCCATCGACTGATGTATTCCTTGTTTGTTTTTC 518 

Qy 266 tcttataagcaaggccagttatgaaaacatctacaaaaagtggatcccagagctaagaca 325 
430 ctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcagctcca 489 

II I II II II I II II Mil I I III I Mill I I 
550 CTGAAGGCCAACAAGTTGCTCAAAGAATTGGTGCTGCTGATTACTTGGAATGTTCTGCTA 609 

490 aaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgaggccac 549 

Mill III I I II II II Mill I I II Mil II 
610 AAACCGGTAGAGGTGTTAGAGAAGTGTTTGAAGCTGCTACTAGAGCTTCTTTAAGAGTTA 669 

550 caaaaccaaagagaaag 566 

i ii iii mi 

670 AAGAAAAGAAGGAAAAG 686 



RESULT 5 
T92702 

ID T92702 standard; cDNA; 3198 BP. 

AC T92702; 

DT 30-APR-1998 (first entry) 

DE Candida CaRho1 gene. 

KW GTPase; GGPTase; geranylgeranyl transferase; fungal Rho-like GTPase; 

KW antifungal agent identification; mycosis; feed additive; disinfectant; 

KW therapy; Candida cell detection; cell wall integrity; hyphal formation; 

KW pathogenesis; Candida CaRho1 gene; ds. 

OS Candida sp. 

FH Key Location/Qualifiers 

FT CDS 1362..1959 

FT /*tag= a 

PN WO9738129-A1. 

PD 16-OCT-1997. 

PF 10-APR-1997; U05929. 

PR 10-APR-1996; US-631319. 

PA (MITO-) MITOTIX INC. 

PA (UYJO ) UNIV JOHNS HOPKINS. 

PI Berlin V, Damagnez V, Smith SE; 

DR WPI; 97-512735/47. 

DR P-PSDB; W30379. 

PT Identification of antifungal agents that inhibit GTPase - useful for 

PT specific detection of Candida 

PS Claim 118; Fig 16; 123pp; English. 

CC This sequence represents the Candida CaRho1 gene. The encoded protein is 

CC a fungal Rho-like GTPase. The encoded protein can be used in an assay of 

CC the invention. The method of the invention is for identifying potential 

CC antifungal agents (I), and comprises: (a) mixing a fungal geranylgeranyl 

CC transferase (GGPTase), a GGPTase substrate (A), and test compound; 

CC and (b) detecting interaction between GGPTase and (A); a significant 

CC reduction in this interaction indicates that the test compound is a (I). 

CC (I) are useful for treating mycoses in humans or animals; as feed 

CC additives and as disinfectants. This sequence, and MAb specifically 

CC reactive with the encoded protein are used to detect Candida cells 

CC specifically, particularly in cells, tissues and body fluids, while 

CC antisense sequences are used to inhibit expression of these genes. The 

CC method is a rapid, reliable and effective way of detecting agents that 

CC inhibit GTPases, particularly those involved in cell wall integrity, 

CC formation of hyphae and/or other cellular functions necessary for 

CC pathogenesis. (I) should be selective for fungal cells, with little 

CC effect on mammalian cells. 

SQ Sequence 3198 BP; 1057 A; 538 C; 517 G; 1086 T; 



Query Match 18.0%; Score 163.8; DB 1; Length 3198; 

Best Local Similarity 57.3%; Pred. No. 6.8e-32; 

Matches 319; Conservative 0; Mismatches 232; Indels 6; Gaps 

Qy 16 ctgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaaaacttgtatgc 75 

iii i ii ii i mi minimi in n n mill i 

Db 1378 CTGAACTTCGTAGAAAATTAGTCATTGTCGGTGATGGTGCTTGTGGTAAGACTTGTTTAT 1437 

Qy 76 tcatttcatataccagcaatactttcccaacggattatgttccaacagtatttgataact 135 

I III I I II llllllllll I llllll MINN Mill II I 
Db 1438 TAATTGTTTTTTCAAAAGGTACTTTCCCAGAAGTTTATGTCCCAACAGTTTTTGAAAATT 1497 

Qy 136 ttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatgggacactgccg 195 



Ill llll I II Mill III III I II llllll lllll I 

Db 1498 ACGTTGCTGATGTTGAAGTTGATGGTAGAAAAGTTGAATTGGCATTATGGGATACTGCTG 1557 
Qy 196 ggcaagaagattataataggctaaggccactgagttatagaggagctgatgtgtttttgt 255 

i illinium iiii iiii m i nn n . n mi inn 

Db 1558 GTCAAGAAGATTATGATAGATTAAGACCATTATCTTATCCAGATTCTAATGTTATTTTGA 1617 

Qy 256 tggccttttctcttataagcaaggccagttatgaaaacatctacaaaaagtggatcccag 315 

I Mill II I II II III I I IIII lllll I I 

Db 1618 TTTGTTTTTCAGTTGATTCACCAGATTCTTTAGATAACGTTTTAGAAAAATGGATTTCTG 1677 

Qy 316 agctaagacattatgctcataatgtaccagttgtgcttgttggaaccaaactagatttgc 375 

i I iiiii iii m iii i i i inn iii inn 

Db 1678 AAGTTTTACATTTCTGTCAAGGTGTTCCAATCATTTTAGTTGGTTGTAAATCTGATTTAA 1737 

Qy 376 gagatgacaagcagttcct cattgatcaccctggagcaacaccaatatcaacat 429 

lllllll II I I I II I I I IIII I lllll I 
Db 1738 GAGATGATCCTCATACTATTGAAGCCTTGAGACAACAACAACAACAACCAGTCTCAACTT 1797 

Qy 430 ctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcagctcca 489 

II I II II II I II II II II I I III I lllll I I 

Db 1798 CTGAAGGCCAACAAGTTGCTCAAAGAATTGGTGCTGCTGATTACTTGGAATGTTCTGCTA 1857 

Qy 490 aaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgaggccac 549 

lllll III I I II II II Mill I I II IIII II 
Db 1858 AAACCGGTAGAGGTGTTAGAGAAGTGTTTGAAGCTGCTACTAGAGCTTCTTTAAGAGTTA 1917 

Qy 550 caaaaccaaagagaaag 566 

I II III IIII 
Db 1918 AAGAAAAGAAGGAAAAG 1934 



RESULT 
T92869 

ID T92869 standard; cDNA; 3198 BP. 
AC T92869; 
DT 30-APR-1998 (first entry) 
DE Candida CaRho1 gene. 
KW GTPase; GGPTase; geranylgeranyl transferase; fungal Rho-like GTPase; 
KW antifungal agent identification; mycosis; feed additive; disinfectant; 
KW therapy; Candida cell detection; cell wall integrity; hyphal formation; 
KW pathogenesis; Candida CaRho1 gene; ds. 
OS Candida sp. 
FH Key Location/Qualifiers 
FT CDS 1362..1959 
FT /*tag= a 
PN WO9738293-A2. 
PD 16-OCT-1997. 
PF 11-APR-1997; U05987. 
PR 20-DEC-1996; US-771212. 
PR 11-APR-1996; US-631319. 
PA (MITO-) MITOTIX INC. 
PA (UYJO ) UNIV JOHNS HOPKINS. 
DR WPI; 97-512864/47. 
DR P-PSDB; W33896. 
PT Identification of antifungal agents that inhibit GTPase - useful for 
PT specific detection of Candida 
PS Claim 118; Fig 16; 118pp; English. 
CC This sequence represents the Candida CaRho1 gene. The encoded protein is 
CC a fungal Rho-like GTPase. The encoded protein can be used in an assay of 
CC the invention. The method of the invention is for identifying potential 
CC antifungal agents (I), and comprises: (a) mixing a fungal geranylgeranyl 
CC transferase (GGPTase), a GGPTase substrate GGPTase, and test compound; 
CC and (b) detecting interaction between GGPTase and GGPTase; a significant 
CC reduction in this interaction indicates that the test compound is a (I). 
CC (I) are useful for treating mycoses in humans or animals; as feed 
CC additives and as disinfectants. This sequence, and MAb specifically 
CC reactive with the encoded protein are used to detect Candida cells 
CC specifically, particularly in cells, tissues and body fluids, while 
CC antisense sequences are used to inhibit expression of these genes. The 
CC method is a rapid, reliable and effective way of detecting agents that 
CC inhibit GTPases, particularly those involved in cell wall integrity, 
CC formation of hyphae and/or other cellular functions necessary for 
Db 121 
Qy 181 
Db 181 
Qy 241 
Db 241 
Qy 301 
Db 301 
Qy 361 
Db 361 
Qy 421 
Db 421 
Qy 481 
Db 481 
Qy 541 
Db 541 
Qy 601 
Db 601 
Qy 661 
Db 661 
Qy 721 
Db 721 
Qy 781 
Db 781 
Qy 841 
Db 841 
Qy 901 
Db 901 

[Alignment residues and match lines not recoverable; only the block start positions survive.] 






RESULT 2 
T73866 



ID T73866 standard; DNA; 3045 BP. 
AC T73866; 

DT 26-JAN-1998 (first entry) 

DE Cotton fibre promoter clone Rac13 construct, pCGN4735. 

KW promoter; fibre-specific; transcriptional factor; promoter; 

KW altered phenotype; colour; melanin; indigo; ss. 

OS Gossypium hirsutum cv. Coker 130. 

PN WO9640924-A2. 

PD 19-DEC-1996. 

PF 07-JUN-1996; U09897. 

PR 07-JUN-1995; US-480178. 

PR 01-JUL-1996; ZA-005572. 

PA (CALJ ) CALGENE INC. 

PI Mcbride K, Pear JR, Perez-Grau L, Stalker DM; 
DR WPI; 97-052325/05. 



PT DNA construct contg. gene of interest controlled by cotton fibre 

PT transcriptional factor - used to produce altered phenotype cotton 

PT fibre cells expressing genes affecting pigmentation 

PS Claim 23; Fig 5A-E; 95pp; English. 

CC The present sequence is the Rac13 promoter construct, pCGN4735, isolated 

CC from cotton fibre genomic clone 15-1. DNA constructs containing 

CC cotton fibre-specific transcriptional factor promoters are useful to 

CC produce cotton fibre cells with altered phenotype, especially altered 

CC colour. Genes involved in the production of melanin (e.g. tyrosinase 

CC gene and ORF438 encoded protein from Streptomyces antibioticus) and 

CC indigo (mono-oxygenase genes possibly in conjunction with a 

CC tryptophanase gene) are of interest. The promoters of the invention are 

CC reliable and permit expression of a protein selectively in cotton fibre 

CC to affect qualities such as fibre strength, length, colour and dyability 

CC as required. The construct and methods can also be used for the 

CC introduction of other advantageous genes into a cotton plant, e.g. a 

CC plant hormone. In particular, fibres from a plant producing coloured 

CC fibres may be used to produce yarns and/or fabrics that do not require 

CC dyeing. 

SQ Sequence 3045 BP; 1063 A; 450 C; 366 G; 1162 T; 



Query Match 33.5%; Score 304.8; DB 1; Length 3045; 

Best Local Similarity 97.8%; Pred. No. 4.6e-67; 

Matches [value not legible]; Conservative 0; Mismatches 7; Indels 0; 

Qy 594 
Db 1806 
Qy 654 
Db 1866 
Qy 714 
Db 1926 
Qy 774 
Db 1986 
Qy 834 
Db 2046 
Qy 894 
Db 2106 

[Alignment residues and match lines not recoverable; only the block start positions survive.] 



RESULT 3 
T92698 

ID T92698 standard; cDNA; 985 BP. 

AC T92698; 

DT 30-APR-1998 (first entry) 

DE Candida CaRho1 gene. 

KW GTPase; GGPTase; geranylgeranyl transferase; fungal Rho-like GTPase; 

KW antifungal agent identification; mycosis; feed additive; disinfectant; 

KW therapy; Candida cell detection; cell wall integrity; hyphal formation; 

KW pathogenesis; Candida CaRho1 gene; ds. 

OS Candida sp. 

FH Key Location/Qualifiers 

FT CDS 114..710 

FT /*tag= a 

PN WO9738129-A1. 

PD 16-OCT-1997. 

PF 10-APR-1997; U05929. 

PR 10-APR-1996; US-631319. 

PA (MITO-) MITOTIX INC. 

PA (UYJO ) UNIV JOHNS HOPKINS. 

PI Berlin V, Damagnez V, Smith SE; 

DR WPI; 97-512735/47. 
II llllllll lllll MM Mil III I I I || || Mill || I 

675 CAAACCAGGGAGAGGAACTGAAGAAACTGATTGGATCTGCTGTCTACATTGAATGTAGTT 734 

487 ccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgaggc 546 

I II II IMIMI I'llMIM II II HIM II llllllll I I II 
735 CAAAGACACAGCAGAACGTGAAGGCAGTGTTTGATGCAGCTATAAAAGTGGTGCTTCAGC 794 

547 caccaaaaccaaagagaaagccttgcaaaaggagaacatgtgctttcctttga 599 

llillll I MM III II I II II M III I III 
795 CACCAAAGCAGAAGAAGAAGAAAAAGAATAAGAACCGTTGCGCGTTCTTGTGA 847 



Search completed: September 3, 2000, 04:03:47 
Job time: 35777 sec 
SOURCE 
ORGANISM 



Db 826 ATCCCTGAGCTCAGACACTATGCGCCATCGGTACCCATCATTCTCGTTGGGACGAAGCTA 885 
Qy 369 gatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatcaaca 428 

iii i iiiiiiii ii nun i iii ii mum i in mi 

Db 886 GATCTTCGAGATGATAAACAGTTCTTTGCTGACCATCCTGGAGCGGCTCCAATTACAACC 945 

Qy 429 tctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcagctcc 488 

Mill II II II II I Nil II Mill I I I lllll II II Mill 
Db 946 TCTCAAGGCGAGGAGCTCAGGAAGTCAATTGGAGCGGCTTCGTATATTGAGTGTAGCTCG 1005 

Qy 489 aaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgaggcca 548 

II II llllllllllllll II lllll lllll lllll INI I I I MM 

Db 1006 AAGACTCAACAGAATGTGAAAGCAGTTTTTGATGCAGCAATCAAGGTGGTTCTTCAGCCA 1065 

Qy 549 ccaaaaccaaagagaaag 566 

II II I Mil III 
Db 1066 CCCAAGCAGAAGAAGAAG 1083 



RESULT 13 
AF115476 

LOCUS       AF115476     1558 bp    mRNA            PLN       20-APR-1999 
DEFINITION  Physcomitrella patens rac-like GTP binding protein (rac2) mRNA, 
            complete cds. 
ACCESSION   AF115476 
VERSION     AF115476.1  GI:4588757 
SOURCE      Physcomitrella patens. 
  ORGANISM  Physcomitrella patens 
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Bryopsida; 
            Bryidae; Funariales; Funariaceae; Physcomitrella. 
REFERENCE   1 (bases 1 to 1558) 
  AUTHORS   Winge,P., Kristensen,R., Bones,A.M. and Reski,R. 
  TITLE     The Physcomitrella patens rac-gene family 
  JOURNAL   Unpublished 
REFERENCE   2 (bases 1 to 1558) 
  AUTHORS   Kristensen,R., Winge,P., Bones,A.M. and Reski,R. 
  TITLE     Direct Submission 
  JOURNAL   Submitted (18-DEC-1998) The Norwegian University of Science and 
            Technology, UNIGEN MTFS, Olav Kyrresgate 3, Trondheim N7005, Norway 
FEATURES             Location/Qualifiers 
     source          1..1558 
                     /organism="Physcomitrella patens" 
                     /db_xref="taxon:3218" 
     gene            1..1558 
                     /gene="rac2" 
     5'UTR           1..530 
                     /gene="rac2" 
     CDS             531..1121 
                     /gene="rac2" 
                     /note="PhRac2" 
                     /codon_start=1 
                     /product="rac-like GTP binding protein" 
                     /protein_id="AAD26198.1" 
                     /db_xref="GI:4588758" 
                     /translation="MSTSRFIRCVTVGDGAVGKTCMLISYTSNTFPTDYVPTVFDNFS 
                     ANVVVDGNTVNLGLWDTAGQEDYNRLRPLSYRGADVFLLAFSLISKASYENISKKWIP 
                     ELRHYAPSVPIILVGTKLDLRDDKQFFADHPGAAPITTSQGEELRKSIGAASYIECSS 
                     KTQQNVKAVFDAAIKVVLQPPKQKRKKKKQKNCVIL" 
     3'UTR           1122..1558 
                     /gene="rac2" 
BASE COUNT      368 a    309 c    427 g    454 t 
ORIGIN 



Query Match 37.9%; Score 344.8; DB 8; Length 1558;
Best Local Similarity 72.7%; Pred. No. 5.6e-61;
Matches 445; Conservative 0; Mismatches 167; Indels 0; Gaps

Qy 5 aacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaa 64 

i i nmiMi ii i iiiiiiiiiii ii ii ii ii minim iiiii 

Db 527 AGCCATGAGCACTTCACGGTTTATCAAGTGCGTGACTGTTGGAGATGGAGCTGTCGGGAA 586 



Qy 65 aacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagt 124 

II II Mill IMIIMI IMMIII II II II II lllll Mill M M 

Db 587 GACGTGCATGCTTATTTCATACACCAGCAACACATTTCCTACTGATTACGTTCCTACCGT 646 

Qy 125 atttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatg 184 

lllll Mill II II lllll Mill lllll I II II lllll II MM 
Db 647 GTTTGACAACTTCAGCGCAAATGTAGTGGTCGATGGAAATACCGTCAACCTCGGGTTATG 706 

Qy 185 ggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagctga 244 

III II II II IIIIIIIIIII II lllll I II IIIIIIII II II Mill 

Db 707 GGATACAGCAGGTCAAGAAGATTACAACAGGCTTCGTCCTCTGAGTTACAGGGGTGCTGA 766 

Qy 245 tgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaa 304 

III III I llll II II II II IIIIIIII llllllllllllll I II II 
Db 767 TGTTTTTCTCCTGGCGTTCTCCCTCATCAGCAAGGCTAGTTATGAAAACATATCAAAGAA 826 

Qy 305 gtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaa 364 

IIIIIIII II II IIIIIIII II I Ml III I I Mil lllll II 
Db 827 GTGGATCCCGGAACTGAGACATTACGCGCCATCTGTGCCAATCATTCTCGTCGGAACAAA 886 

Qy 365 actagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatc 424 

III Ml I II IIIIIIII II III I MUM MMIMI I IMMI I 
Db 887 ACTTGATCTTCGCGATGACAMCMTTCTTTGCTGATCATCCTGGAGCGGCTCCAATAAC 946 

Qy 425 aacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcag 484 

II Mill II M II II I llll III lllll I II Mill lllll 
Db 947 TACTTCTCAAGGGGAGGAGCTCAGGAAGTCGATTGGGGCGGCCTCGTACATAGAGTGCAG 1006 

Qy 485 ctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgag 544 

Ml II II II IIIIIIII II II Mill M II lllll II II I I I 
Db 1007 CTCAAAGACTCAGCAGAATGTAAAAGCAGTTTTTGACGCAGCAATCAAGGTGGTTCTCCA 1066 

Qy 545 gccaccaaaaccaaagagaaagccttgcaaaaggagaacatgtgctttcctttgaatatt 604 

IMMIII I llll III III I I II I I II lllll I 
Db 1067 ACCACCAAAGCAGAAGAAGAAGAAGAAAAAACAAAAGAATTGCGTCATTCTGTGAATGTG 1126 

Qy 605 ggatcattatta 616 

I ii I ill * 

Db 1127 GCATAGCTTTTA 1138 
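
The percentages printed in each summary block appear to follow simple ratios of the other figures. A minimal Python sketch, under two assumptions inferred from the numbers in this listing rather than from the search program's documentation: Best Local Similarity is the number of matches divided by the number of aligned columns, and Query Match is the score divided by the 910 reported for the identical sequence in RESULT 1. The function names are illustrative.

```python
def best_local_similarity(matches, conservative, mismatches, indels=0):
    """Percent of aligned columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def query_match(score, self_score=910.0):
    """Score as a percentage of the self-match score (assumed definition)."""
    return 100.0 * score / self_score


# AF115476: 445 matches, 0 conservative, 167 mismatches, score 344.8
print(round(best_local_similarity(445, 0, 167), 1))  # 72.7
print(round(query_match(344.8), 1))                  # 37.9
```

Recomputing these values is a quick way to spot misread digits in a summary block, since a wrong count or score usually breaks the agreement with the printed percentage.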



RESULT 14
NTA250174
LOCUS       NTA250174     803 bp    mRNA            PLN       06-OCT-1999
DEFINITION  Nicotiana tabacum mRNA for putative rac protein (rac gene).
ACCESSION   AJ250174
VERSION     AJ250174.1  GI:6015626
KEYWORDS    rac gene; rac protein.
SOURCE      common tobacco.
  ORGANISM  Nicotiana tabacum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core
            eudicots; Asteridae; euasterids I; Solanales; Solanaceae;
            Nicotiana.
REFERENCE   1  (bases 1 to 803)
  AUTHORS   Kieffer,F., Elmayan,T., Simon-Plas,F., Dagher,M.C. and Blein,J.P.
  TITLE     A tobacco cDNA encoding a Rac-like protein cloned using the
            two-hybrid system in an heterologous screen
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 803)
  AUTHORS   Elmayan,T.
  TITLE     Direct Submission
  JOURNAL   Submitted (05-OCT-1999) Elmayan T., UMR 692 INRA/Universite de
            Bourgogne, INRA / laboratoire de Phytopharmacie, BV 1540, 21034
            Dijon Cedex, FRANCE
FEATURES             Location/Qualifiers
     source          1..803
                     /organism="Nicotiana tabacum"
                     /cultivar="Xanthi"
                     /db_xref="taxon:4097"
                     /dev_stage="55 day-old plants"
                     /tissue_type="young leaves: the nearest from apex"
                     /clone="Rac5"
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Qy 121 cagtatttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcc 180 

I I! IMMMIIII II II Mill II II Mill I III II llllllll I 
Db 353 CCGTTTTTGATAACTTCAGCGCGAATGTCGTTGTCGATGGAAACACTGTAAACCTTGGAC 412 

Qy 181 tatgggacactgccgggcaagaagattataataggctaaggccactgagttatagaggag 240 

mini ii ii ii ii iiiiiiiiiii inn i n imini n n i 

Db 413 TATGGGATACAGCAGGTCAGGAAGATTATAACAGGCTTCGACCTCTGAGTTACAGGGGTG 472 

Qy 241 ctgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctaca 300 

MUNI Ml I I II II llllllll Mill II llllllll Illllll I 
Db 473 CTGATGTTTTTCTACTAGCATTCTCTCTTATCAGCAAAGCTAGTTATGAGAACATCTCAA 532 

Qy 301 aaaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaa 360 

I llllllll II II II lllll II MM llllll II I II II III 
Db 533 AGAAGTGGATTCCTGAACTGAGACACTACGCTCCATCTGTACCTATTATCCTCGTCGGAA 592 

Qy 361 ccaaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaa 420 

i ii mill i [illinium n n n n imini i n i 

Db 593 CGAAGCTAGATCTTCGAGATGACAAGCAATTTTTCGCCGACCATCCTGGAGCGGCTCCGA 652 
Qy 421 tatcaacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaat 480 

i mi ii ii ii ilium i iii m mini mi n n i 

Db 653 TCACAACCTCCCAAGGTGAAGAACTCAGAAAGGCGATIGGAGCAGCCTCTTACATTGAGI 712 

Qy 481 gcagctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctt 540 

I mil n ii ii Minimi n inn mini n nil i i 

Db 713 GTAGCTCTAAGACTCAGCAGAATGTGAAAGCAGTTTTTGATGCTGCCATCAAGGTTGTTC 772 

Qy 541 tgaggccaccaaaaccaaagagaaagccttgcaaaaggagaacatgtgctttcctttgaa 600 

I llllll II I llll Ill III I I Mil I I II llll 
Db 773 TTCAGCCACCCAAGCAGAAGAAGAAGAAGAAAAAACAAAAGAACTGTGTTATTCTCTGAA 832 



RESULT 10
ATU64919
LOCUS       ATU64919     1008 bp    mRNA            PLN       05-JAN-1999
DEFINITION  Arabidopsis thaliana geranylgeranylated protein ATGP2 mRNA,
            complete cds.
ACCESSION   U64919
VERSION     U64919.1  GI:4097562
KEYWORDS    .
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core
            eudicots; Rosidae; eurosids II; Brassicales; Brassicaceae;
            Arabidopsis.
REFERENCE   1  (bases 1 to 1008)
  AUTHORS   Biermann,B.J., Price,J.R., Crowell,D.N. and Randall,S.K.
  TITLE     A collection of cDNAs encoding isoprenylated plant proteins
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1008)
  AUTHORS   Biermann,B.J., Price,J.R., Crowell,D.N. and Randall,S.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (23-JUL-1996) Biology, IUPUI, 723 West Michigan Street,
            Indianapolis, IN 46202-5132, USA
FEATURES             Location/Qualifiers
     source          1..1008
                     /organism="Arabidopsis thaliana"
                     /db_xref="taxon:3702"
     CDS             223..816
                     /note="similar to Rho1Ps; geranylgeranylated protein"
                     /codon_start=1
                     /product="ATGP2"
                     /protein_id="AAD00113.1"
                     /db_xref="GI:4097563"
                     /translation="MSASRFIKCVTVGDGAVGKTCLLISYTSNTFPTDYVPTVFDNFS
                     ANVVVNGATVNLGLWDTAGQEDYNRLRPLSYRGADVFILAFSLISKASYENVSKKWIP
                     ELKHYAPGVPIVLVGTKLDLRDDKQFFIDHPGAVPITTAQGEELKKLIGAPAYIECSS
                     KTQENVKGVFDAAIRVVLQPPKQKKKKSKAQKACSIL"
BASE COUNT      276 a    191 c    221 g    320 t
ORIGIN



Query Match 38.1%; Score 347; DB 8; Length 1008; 

Best Local Similarity 73.5%; Pred. No. 2.1e-61; 

Matches 443; Conservative 0; Mismatches 160; Indels 0; Gaps



Qy 2 aaaaacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggg 61 

II I Illllll II I II I! II IIIIMIIIII II II II llllllll II 
Db 216 AAGAGAAATGAGCGCTTCGAGGTTCATAAAGTGTGTCACCGTTGGCGACGGAGCTGTTGG 275 

Qy 62 gaaaacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaac 121 

lllll III llll lllll II llllllll lllll II IIIIIIIIIII II n 
Db 276 TAAAACCTGTTTGCTGATTTCTTACACCAGCAACACTTTTCCTACGGATTATGTACCGAC 335 

Qy 122 agtatttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcct 181 

ii ii iiniiimi ii iiiiiiii ii mi iii iiiii n urn 

Db 336 TGTTTTCGATAACTTTAGCGCAAATGTGGTTGTTAATGGAGCCACTGTGAATCTGGGCCT 395 
Qy 182 atgggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagc 241 

iiiiii ii ii iiiii n immi ii mi ii mini i n n 

Db 396 ATGGGATACCGCAGGGCAGGAGGATTATAACAGATTAAGACCTTTGAGTTACCGCGGTGC 455 
Qy 242 tgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaa 301 

mm ii i ii ii ii iiiiiiii ii nm imim n in in 

Db 456 TGATGTTTTCATCTTAGCATTCTCTCTTATCAGTAAGGCTAGTTATGAGAATGTCTCCAA 515 

Qy 302 aaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaac 361 

1 1 1 1 M 1 1 1 1 1 1 1 1 M I I llllllll I I III II I II IIIIIIIIIII 
Db 516 GAAGTGGATCCCAGAGCTGAAGCATTATGCCCCTGGTGTCCCTATAGTTCTTGTTGGAAC 575 

Qy 362 caaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaat 421 

1 1 1 1 1 1 1 1 1 1 i ii iiiiiiii iiiiii iiiim iiiiiiii ii mm 

Db 576 CAAACTAGATCTTCGGGATGACAAACAGTTCTTCATTGACCACCCTGGCGCTGTACCAAT 635 

Qy 422 atcaacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatg 481 

I II minim lllll lllll I II lllll I II II II II 

Db 636 TACTACTGCTCAGGGAGAGGAACTGAAGAAACTAATTGGAGCTCCCGCATACATCGAGTG 695 
Qy 482 cagctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagcttt 541 

iii ii inn iii mi nm i n n inn n n m i i-'r 

Db 696 CAGTTCAAAAACACAAGAGAACGTGAAAGGAGTATTTGATGCAGCGATCCGAGTGGTTCT 755 

Qy 542 gaggccaccaaaaccaaagagaaagccttgcaaaaggagaacatgtgctttcctttgaat 601 

II lllll I llll llll lllll II I II III 
Db 756 TCAACCTCCAAAGCAGAAGAAAAAGAAAAGCAAAGCACAAAAAGCCTGCTCCATTTTGTA 815 

Qy 602 att 604 

III 

Db 816 ATT 818 
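
The BASE COUNT line of a record should add up to the length given on its LOCUS line, which makes it a convenient check on misread digits. A minimal Python sketch, assuming the line has the "count letter" layout seen in this listing; the helper name `base_count_total` is illustrative.

```python
import re


def base_count_total(line):
    """Sum the counts in a GenBank-style BASE COUNT line such as '276 a 191 c'."""
    # Each entry is a number followed by a base letter or the word 'others'.
    pairs = re.findall(r"(\d+)\s+([a-z]+)", line)
    return sum(int(count) for count, _ in pairs)


# ATU64919 is listed as 1008 bp with this base count.
print(base_count_total("276 a 191 c 221 g 320 t"))  # 1008
```

The same check works for records that report ambiguous bases, for example `19 a 62 c 0 g 589 t 66 others` for a 736 bp entry.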



RESULT 11
ATU41295
LOCUS       ATU41295     1191 bp    mRNA            PLN       28-OCT-1997
DEFINITION  Arabidopsis thaliana GTP binding protein (ARAC1) mRNA, complete
            cds.
ACCESSION   U41295
VERSION     U41295.1  GI:1292907
KEYWORDS    .
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core
            eudicots; Rosidae; eurosids II; Brassicales; Brassicaceae;
            Arabidopsis.
REFERENCE   1  (bases 1 to 1191)
  AUTHORS   Winge,P., Brembu,T. and Bones,A.M.
  TITLE     Cloning and characterization of rac-like cDNAs from Arabidopsis
            thaliana
  JOURNAL   Plant Mol. Biol. 35 (4), 483-495 (1997)
  MEDLINE   98009984
REFERENCE   2  (bases 1 to 1191)
  AUTHORS   Winge,P., Brembu,T. and Bones,A.
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Query Match 38.4%; 
Score 349.4; DB 7; Length 950;
Best Local Similarity 75.5%; Pred. No. 6.8e-62;
Matches 434; Conservative 0; Mismatches 141; Indels 0;

Qy 1 aaaaaacaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtgg 60 

. nun nun ii mi iiiiimimii inn iinini n n i 

Db 36 AAAAAAAAATGAGTGCTCCAAGGTTTATCAAGTGTGTTACGGTTGGTGATGGTGCCGTTG 95 
Qy 61 ggaaaacttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaa 120 

Minim i i ilium ininminnni i miiim n i 

Db 96 GGAAAACTTGTCTTTTGATTTCATACACCAGCAATACTTTCCCTATGGATTATGTGCCCA 155 

Qy 121 cagtatttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcc 180 

i ii iiiii ii ii inn inn ii ii im iiiii ii inn 11 

Db 156 CTGTGTTTGACAATTTCAGTGCAAATGTTGTTGTCAATGGGAGCACTGTCAACCTAGGGT 215 
Qy 181 tatgggacactgccgggcaagaagattataataggctaaggccactgagttatagaggag 240 

i iiiii i ii 1 1 m i ii ii inn nun mi ii 1 1 1 1 1 1 1 1 i mi 

Db 216 TGTGGGATACTGCCGGACAGGAGGATTACAATAGGTTAAGACCTCTGAGTTACCGTGGAG 275 

Qy 241 ctgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctaca 300 

I IIIII II I IIIII II IIIII II II II IIIII IIIII II I I II 
Db 276 CCGATGTCTTCATTTTGGCATTCTCTCTCATTAGTAAAGCCAGCTATGAGAATGTATCCA 335 

Qy 301 aaaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaa 360 

I illllll II III I I II lllllll I III III I II llllllllll 
Db 336 AGAAGTGGATTCCTGAGTTGAAGCACTATGCTCCTGGTGTCCCAATAGTTCTTGTTGGAA 395 

Qy 361 ccaaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaa 420 

Db 396 CAAAACTTGATCTTCGGG^ 455 

Qy 421 tatcaacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaat 480 

i i ii iiiiiii ii ii mi iii i mn m i mi ii mi 

Db 456 TTACTACTGCTCAGGGTGAGGAGCTAAGGAAAACTATAGGTGCACCTGCTTACATCGAAT 515 

Qy 481 gcagctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctt 540 

I II II IIIII 11111111111111 II II II IIIII II II II II I 
Db 516 GTAGTTCAAAAACACAACAGAATGTGAAAGCAGTCTTTGATGCAGCCATTAAGGTCGTCC 575 

Qy 541 tgaggccaccaaaaccaaagagaaagccttgcaaa 575 

i iii ii iii mm im i m 

Db 576 TCCAGCCGCCTAAAACAAAGAAAAAGAAGGGGAAA 610 



RESULT 7 
ATU49972 

LOCUS       ATU49972     935 bp    DNA             PLN       19-NOV-1998
DEFINITION  Arabidopsis thaliana GTP binding protein Rop2At (Rop2At) mRNA,
            complete cds.
ACCESSION   U49972
VERSION     U49972.1  GI:1777763
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core
            eudicots; Rosidae; eurosids II; Brassicales; Brassicaceae;
            Arabidopsis.
REFERENCE   1  (bases 1 to 935)
  AUTHORS   Li,H., Wu,G., Ware,D., Davis,K.R. and Yang,Z.
  TITLE     Arabidopsis rho-related GTPases: differential gene expression in
            pollen and polar localization in fission yeast
  JOURNAL   Plant Physiol. 118 (2), 407-417 (1998)
  MEDLINE   98440662
REFERENCE   2  (bases 1 to 935)
  AUTHORS   Li,H., Lin,Y., Ware,D., Zhou,D., Davis,K.R., Cramer,C.L. and
            Yang,Z.
  TITLE     Differential Developmental Expression of Two Arabidopsis Genes
            Encoding Rho Family Small GTPases
  JOURNAL   Unpublished
REFERENCE   3  (bases 1 to 935)
  AUTHORS   Yang,Z.
  TITLE     Direct Submission
  JOURNAL   Submitted (26-FEB-1996) Plant Biotechnology Center, The Ohio State
            University, 1060 Carmack Road, Columbus, OH 43210, USA
FEATURES             Location/Qualifiers
     source          1..935
                     /organism="Arabidopsis thaliana"
                     /cultivar="Columbia"
                     /db_xref="taxon:3702"
     gene            66..653
                     /gene="Rop2At"
     CDS             66..653
                     /gene="Rop2At"
                     /note="Rho family GTPase"
                     /codon_start=1
                     /product="GTP binding protein Rop2At"
                     /protein_id="AAC78391.1"
                     /db_xref="GI:1777764"
                     /translation="MASRFIKCVTVGDGAVGKTCMLISYTSNTFPTDYVPTVFDNFSA
                     NVVVDGNTVNLGLWDTAGQEDYNRLRPLSYRGADVFILAFSLISKASYENIAKKWIPE
                     LRHYAPGVPIILVGTKLDLRDDKQFFIDHPGAVPITTNQGEELKKLIGSAVYIECSSK
                     TQQNVKAVFDAAIKVVLQPPKQKKKKKNKNRCAFL"
BASE COUNT      251 a    174 c    216 g    294 t
ORIGIN



Query Match 38.2%; Score 348; DB 8; Length 935;
Best Local Similarity 71.3%; Pred. No. 1.3e-61;
Matches 459; Conservative 0; Mismatches 185; Indels 0;



Qy 7 caatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaaaa 66 

II II I llll IIIII iiiiiii II Mill IIIII II II II llll 

Db 61 CAGAGATGGCGTCAAGGTTTATTAAGTGTGTGACCGTCGGAGATGGTGCCGTCGGAAAAA 120 

Qy 67 cttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagtat 126 

II II I II 1 1 1 1 1 1 1 1 II II 1 1 1 II I II 1 1 1 II II II 1 1 1 II I Illil II .1, 
Db 121 CTTGCATGCTCATTTCTTACACTAGCAATACTTTTCCTACTGATTATGTGCCAACTGTTT 180 

Qy 127 ttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatggg 186 

I II IIIII IIIII HUM! II IIIIIII III II II IIIII I llll 
Db 181 TCGACAACTTCAGTGCTAATGTGGTTGTTGATGGCAACACTGTCAATCTTGGATTGTGGG 240 

Qy 187 acactgccgggcaagaagattataataggctaaggccactgagttatagaggagctgatg 246 

I IIIII II IIIII II II II III II I II IIIIIII I II IIIIIII 
Db 241 ATACTGCTGGTCAAGAGGACTACAACAGGTTACGACCTTTGAGTTACCGTGGTGCTGATG 300 

Qy 247 tgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaagt 306 

I II I I II II IIIIIII llllllll II IIIII II II III llll 
Db 301 TTTTCATTCTTGCTTTCTCTCTTATTAGCAAGGCTAGCTATGAGAATATAGCCAAGAAGT 360 

Qy 307 ggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaaac 366 

llll II IIIII II llllllllll I III II II I llllllll II llll 
Db 361 GGATTCCTGAGCTCAGGCATTATGCTCCTGGTGTTCCCATTATCCTTGTTGGGACAAAAC 420 

Qy 367 tagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatcaa 426 

I III I lllllllllllll III I II IIIII IIIII II IIIII I I 

Db 421 TCGATCTTCGAGATGACAAGCAATTCTTTATAGATCATCCTGGTGCTGTGCCAATTACTA 480 

Qy 427 catctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcagct 486 

II llllllll IIIII IIIII llll III I I I II II IIIII II I 

Db 481 CAAACCAGGGAGAGGAACTGAAGAAACTGATTGGATCTGCTGTCTACATTGAATGTAGTT 540 

Qy 487 ccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgaggc 546 

I II II II IIIII llllllll II II IIIII II llllllll I I II 
Db 541 CAAAGACACAGCAGAACGTGAAGGCAGTGTTTGATGCAGCTATAAAAGTGGTGCTTCAGC 600 

Qy 547 caccaaaaccaaagagaaagccttgcaaaaggagaacatgtgctttcctttgaatattgg 606 

IIIIIII I llll III II I II II II III I III I 
Db 601 CACCAAAGCAGAAGAAGAAGAAAAAGAATAAGAACCGTTGCGCGTTCTTGTGATAAGAAA 660 

Qy 607 atcattattacagtcaaaaacagttaacaaaagctgttgcagat 650 

I I II II III II III II II! I I III 
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I! MIIIMIIIIIIIIIIMIIIIIMIlim II II lllllllllllllllll 
Db 81 AAAGACTTGTATGCTCATTTCATATACCAGCAATACGTTTCCTACGGATTATGTTCCAAC 140 

Qy 122 agtatttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcct 181 

iii ii ii mi ii ii milium n n n n mimiiimi 

Db 141 AGTTTTCGACAACITCAGCGCAAATGTGGTGGTCGACGGGAGTACCGTGAACCTTGGCCT 200 
Qy 182 atgggacactgccgggcaagaagattataataggctaaggccactgagttatagaggagc 241 

inn nun! ii iiiiiiiiiiiiiiiii inn inn immi 

Db 201 GTGGGATACTGCCGGTCAGGAAGATTATAATAGGCTTAGGCCTTTGAGTTACAGAGGAGC 260 
Qy 242 tgatgtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaa 301 

inn ii ii m ii inn iimmiiiiiimm n n n mi - 

Db 261 AGATGTCTTCTTATTAGCATTTTCCCTTATAAGCAAGGCCAGTTACGAGAATATTCACAA 320 
Qy 302 aaagtggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaac 361 

inn i ii inn i minimi i i i mini i inn 

Db 321 AAAGTGGCTTCCGGAGCTGAAACATTATGCTCCTGGCATCCCCATTGTGCTCGTCGGAAC 380 

Qy 362 caaactagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaat 421 

III IIIIMII I 1 1 1 1 1 M 1 1 1 1 1 1 1 1 I I lllll II llllll I I II 
Db 381 AAAATTAGATTTGAGGGATGACAAGCAGTTCTTGAAGGATCATCCAGGAGCAGCTTCTAT 440 

Qy 422 atcaacatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatg 481 

i mi iiiimmim in m inn inn in n mi n 

Db 441 AACAACTGCTCAGGGAGAAGAATTAAGGAAAATGATTGGAGCTGTTAGGTACTTAGAGTG 500 

Qy 482 cagctccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagcttt 541 

Ilimilllllllllllimilllllll II II III I II INI Mill! 
Db 501 CAGCTCCAAAACCCAACAGAATGTGAAGGCAGTGTTTGATACAGCGATAAGGGTAGCTTT 560 

Qy 542 gaggccaccaaaaccaaagagaaag 566 

1 1 1 1 1 1 1 1 1 1 1 1 mm mi 

Db 561 GAGGCCACCAAAGGCAAAGAAAAAG 585 



RESULT 4 

LJRAC2 

LOCUS       LJRAC2     982 bp    mRNA            PLN       12-MAY-1997
DEFINITION  L.japonicus mRNA for small GTP-binding protein, RAC2.
ACCESSION   Z73962
VERSION     Z73962.1  GI:1370200
KEYWORDS    rac2 gene; small GTP-binding protein.
SOURCE      Lotus japonicus.
  ORGANISM  Lotus japonicus
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
            Rosidae; Fabales; Fabaceae; Papilionoideae; Lotus.
REFERENCE   1  (bases 1 to 982)
  AUTHORS   Borg,S., Brandstrup,B. and Poulsen,C.
  TITLE     Structural analysis of cDNAs encoding 33 different small GTP
            binding proteins from Lotus japonicus and expression of
            corresponding mRNAs in developing root nodules
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 982)
  AUTHORS   Poulsen,C.
  TITLE     Direct Submission
  JOURNAL   Submitted (H-MAY-1996) C. Poulsen, University Of Aarhus, Dept of
            Molecular and Structural Biology, Gustav Wieds Vej 10C, DK-8000
            Aarhus C, DENMARK
REFERENCE   3  (bases 1 to 982)
  AUTHORS   Borg,S., Brandstrup,B., Jensen,T.J. and Poulsen,C.
  TITLE     Identification of new protein species among 33 different small
            GTP-binding proteins encoded by cDNAs from Lotus japonicus, and
            expression of corresponding mRNAs in developing root nodules
  JOURNAL   Plant J. 11 (2), 237-250 (1997)
  MEDLINE   97231679
FEATURES             Location/Qualifiers
     source          1..982
                     /organism="Lotus japonicus"
                     /variety="Gifu B-129"
                     /db_xref="taxon:34305"
                     /tissue_type="root nodules"
                     /dev_stage="21 dpi with Rhizobium loti NZP 2037"
                     /clone_lib="lambda ZAPII (Stratagene)"
     gene            113..703
                     /gene="rac2"
     CDS             113..703
                     /gene="rac2"
                     /function="GTP-binding protein"
                     /codon_start=1
                     /product="RAC2"
                     /protein_id="CAA98190.1"
                     /db_xref="GI:1370201"
                     /db_xref="SWISS-PROT:Q40220"
                     /translation="MSTARFIKCVTVGDGAVGKTCMLISYTSNTFPTDYVPTVFDNFS
                     ANVVVDGSTVNLGLWDTAGQEDYNRLRPLSYRGADVFLLAFSLLSRASYENISKKWIP
                     ELRHYAPTVPIVLVGTKLDLREDRQYLIDHPGATPITTAQGEELKKAIGAAVYLECSS
                     KTQQNVKAVFDAAIKVVLQPPKPKKKRKKTRPCVFL"
BASE COUNT      290 a    173 c    211 g    308 t
ORIGIN



Query Match 43.6%; Score 396.8; DB 7; Length 982;
Best Local Similarity 78.8%; Pred. No. 1.4e-71;
Matches 473; Conservative 0; Mismatches 127; Indels 0; Gaps

Qy 6 acaatgagcactgcaagatttatcaagtgtgtcacggtcggtgatggagctgtggggaaa 65 

i immii ii inn minium n n n immi inn n 

Db 110 AAAATGAGCACAGCTAGATTCATCAAGTGTGTTACTGTTGGAGATGGAGCAGTGGGAAAG 169 

Qy 66 acttgtatgctcatttcatataccagcaatactttcccaacggattatgttccaacagta 125

II llllllll II II II IMMII II lllllllllllllllll II II II 
Db 170 ACCTGTATGCTTATCTCTTACACCAGCAACACATTCCCAACGGATTATGTGCCTACTGTT 229 

Qy 126 tttgataactttagtgccaatgtggtggtggatggcagcacagtgaaccttggcctatgg 185 

iiiiimm mn mimiiimi miiimiiiii mm ii inn 

Db 230 TTTGATAACTTCAGTGCAAATGTGGTGGTTGATGGCAGCACAGTTAACCTGGGATTATGG 289 

Qy 186 gacactgccgggcaagaagattataataggctaaggccactgagttatagaggagctgat 245 

llllllll II II II lllll IMMII lllll MM II MMM Ml 
Db 290 GACACTGCTGGACAGGAGGATTACAATAGGCTTAGGCCTTTGAGCTACAGAGGAGCAGAT 349

Qy 246 gtgtttttgttggccttttctcttataagcaaggccagttatgaaaacatctacaaaaag 305 

Db 350 GTGTTCTTGCTGGCTTTTTCCCTCCTTAGC^ 409 

Qy 306 tggatcccagagctaagacattatgctcataatgtaccagttgtgcttgttggaaccaaa 365 

mn ii ii ii iiiii iiiii i i iii iii mi mn iinnm 

Db 410 TGGATTCCTGAACTGAGACACTATGCCCCAACTGTGCCAATTGTTCTTGTGGGAACCAAA 469

Qy 366 ctagatttgcgagatgacaagcagttcctcattgatcaccctggagcaacaccaatatca 425 

1 1 1 1 II 1 1 I 1 1 III I I III I 11 1 1 1 III I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 
Db 470 CTTGATTTGAGGGAAGACAGGCAGTATTTGATTGATCATCCTGGAGCCACACCTATTACT 529 

Qy 426 acatctcagggagaagaactaaagaagatgataggagcagttacttatatagaatgcagc 485 

ii i 1 1 1 1 1 1 1 1 1 1 1 ii mm ii iMi ii minimi 

Db 530 ACTGCCCAGGGAGAAGAGCTGAAGAAGGCAATTGGTGCTGCTGTGTACCTAGAATGCAGC 589 
Qy 486 tccaaaacccaacagaatgtgaaggctgttttcgatgctgcaataaaagtagctttgagg 545 

ii ii ii immmmimm n iiiiini n n n i mi i 

Db 590 TCAAAGACTCAACAGAATGTGAAGGCTGTGTTTGATGCTGCTATCAAGGTTGTTTTGCAG 649 
Qy 546 ccaccaaaaccaaagagaaagccttgcaaaaggagaacatgtgctttcctttgaatattg 605 

mn miiiini in i n i m mi i immi 1 1 n 

Db 650 CCACCTAAACCAAAGAAAAAACGAAAGAAGACCAGACCATGCGTTTTCCTTTAATTGATG 709 



RESULT 5 
ATU49971 

LOCUS       ATU49971     843 bp    DNA             PLN       19-NOV-1998
DEFINITION  Arabidopsis thaliana GTP binding protein Rop1At (Rop1At) mRNA,
            complete cds.
ACCESSION   U49971
VERSION     U49971.1  GI:2558665
 Score   Query Match %   Length   DB   ID          Description
 334.4       36.7         1008     8   NTU64924    U64924 Nicotiana t
 333.4       36.6         1117     8   AF031428    AF031428 Arabidops
 333         36.6         1220     8   AF051223    AF051223 Picea mar
 332.6       36.5          917     8   MSA251210   AJ251210 Medicago
 330.8       36.4          850    49   AF218381    AF218381 Oryza sat
 329.8       36.2         1059     7   AB024996    AB024996 Cicer ari
 326         35.8          756     8   ATU64920    U64920 Arabidopsis
 325.8       35.8         1081    49   AF233447    AF233447 Physcomit
 320.4       35.2         1393     8   AF126053    AF126053 Zea mays
 317.2       34.9         1058     8   AF126055    AF126055 Zea mays
 313.8       34.5          869     8   AF079486    AF079486 Arabidops
 310.6       34.1          640    49   AF165925    AF165925 Gossypium
 308         33.8          771     8   AF156896    AF156896 Arabidops
 291.8       32.1         1067     7   AB029510    AB029510 Oryza sat
 290.8       32.0         1045     8   AF126054    AF126054 Zea mays
 290         31.9          794     8   AF079485    AF079485 Arabidops
 286         31.4          996     7   AB029508    AB029508 Oryza sat
 285.8       31.4          867    49   AF239751    AF239751 Tradescan
 283.2       31.1         1090     8   AF079484    AF079484 Arabidops
 282.2       31.0         1127     8   AF126052    AF126052 Zea mays
 277.6       30.5         1087     7   AB029509    AB029509 Oryza sat

RESULT 1
S79308
LOCUS       S79308     913 bp    mRNA            PLN       30-NOV-1995
DEFINITION  Rac13=21.8 kda GTP-binding protein [Gossypium hirsutum=cotton
            plants, cv. Acala SJ-2, boll fibers, mRNA Partial, 913 nt].
ACCESSION   S79308
VERSION     S79308.1  GI:1087110
KEYWORDS    .
SOURCE      upland cotton boll fibers cv. Acala SJ-2.
  ORGANISM  Gossypium hirsutum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core
            eudicots; Rosidae; eurosids II; Malvales; Malvaceae; Gossypium.
REFERENCE   1  (bases 1 to 913)
  AUTHORS   Delmer,D.P., Pear,J.R., Andrawis,A. and Stalker,D.M.
  TITLE     Genes encoding small GTP-binding proteins analogous to mammalian
            rac are preferentially expressed in developing cotton fibers
  JOURNAL   Mol. Gen. Genet. 248 (1), 43-51 (1995)
  MEDLINE   95379748
  REMARK    GenBank staff at the National Library of Medicine created this
            entry [NCBI gibbsq 170155] from the original journal article.
            This sequence comes from Fig. 1A.
FEATURES             Location/Qualifiers
     source          1..913
                     /organism="Gossypium hirsutum"
                     /db_xref="taxon:3635"
     gene            12..602
                     /gene="Rac13"
                     /note="21.8 kda GTP-binding protein"
     CDS             12..602
                     /gene="Rac13"
                     /note="21.8 kda GTP-binding protein; pea Rho1 protein
                     homolog/mammalian rac protein homolog; ; This sequence
                     comes from Fig. 1A"
                     /codon_start=1
                     /protein_id="AAB35093.1"
                     /db_xref="GI:1087111"
                     /translation="MSTARFIKCVTVGDGAVGKTCMLISYTSNTFPTDYVPTVFDNFS
                     ANVVVDGSTVNLGLWDTAGQEDYNRLRPLSYRGADVFLLAFSLISKASYENIYKKWIP
                     ELRHYAHNVPVVLVGTKLDLRDDKQFLIDHPGATPISTSQGEELKKMIGAVTYIECSS
                     KTQQNVKAVFDAAIKVALRPPKPKRKPCKRRTCAFL"
BASE COUNT      307 a    169 c    172 g    265 t
ORIGIN

Query Match 100.0%; Score 910; DB 8; Length 913;
Best Local Similarity 100.0%; Pred. No. 1.7e-176;
Matches 910; Conservative 0; Mismatches



[Alignment rows: Qy 1-910 against Db 4-913, 60 bases per row, all positions matching; the sequence text itself is absent from this listing.]



RESULT 2 
S79309 

LOCUS       S79309     840 bp    mRNA            PLN       30-NOV-1995
DEFINITION  Rac9=21.5 kda GTP-binding protein [Gossypium hirsutum=cotton
            plants, cv. Acala SJ-2, boll fibers, mRNA Partial, 840 nt].
ACCESSION   S79309



Tue Sep 5 07:22:54 2000 



us-08-984-099-1.rst



Page 
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us-08-984 
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JOURNAL Submitted ( 02- JUN-1999 ) Genoscope ■ Centre National de Sequencage : 
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
            - Web : www.genoscope.cns.fr)
COMMENT     Determination of this BAC-end sequence was carried out as part of a
            collaboration with the Berkeley Drosophila Genome Project (BDGP).
            The BDGP is constructing a physical map of the Drosophila
            melanogaster genome using these BACs. For further information
            please see http://www.fruitfly.org The BDGP Drosophila
            melanogaster BAC library was prepared by Kazutoyo Osoegawa and
            Aaron Mammoser in Pieter de Jong's laboratory in the Department of
            Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
            NY. The library is named RPCI-98 and was constructed by partial
            EcoRI digestion of Drosophila DNA provided by the BDGP from the
            isogenic strain y2; cn bw sp, the same strain used for the BDGP's
            P1 and EST libraries. A more detailed description of the library
            and how to order individual BAC clones, the entire library, or
            filters for hybridization from the BACPAC Resource Center can be
            found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES             Location/Qualifiers
     source          1..736
                     /organism="Drosophila melanogaster"
                     /db_xref="taxon:7227"
                     /clone_lib="RPCI-98"
                     /clone="BACR19D18"
                     /note="end : TET3"
BASE COUNT       19 a     62 c      0 g    589 t     66 others
ORIGIN



Query Match 8.8%; Score 85.4; DB 122; Length 736;
Best Local Similarity 46.4%; Pred. No. 9.9e-11;
Matches 248; Conservative 1; Mismatches 286; Indels 0; Gaps

Qy 156 acaattggcttcaaaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatca 215 

I I llh MM I Nil I II I II I III I llll I I 
Db 543 ATATTTGKKAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 484 

Qy 216 tgaagagtacccaaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctg 275 

II I I llll I I II III I II I Hill Mill 

Db 483 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 424 

Qy 276 caaacatcatgaagagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataa 335 

III I I II I I I I I II II I I II I I I II 

Db 423 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 364 

Qy 336 agaaaaacccgatttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcga 395 

I Mill I III llll III I I I III I II I 
Db 363 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 304 

Qy 396 atatccgaaaatacccgagtacaaggacaaacaagatgagaataagaaacataaagatga 455 

I I llll I I I II I III II I I II II III I III I I 

Db 303 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 244 

Qy 456 agagtgccaggagtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccga 515 

II I I I Ml III I I II I I I III III I 

Db 243 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 184 

Qy 516 tttccccaaatgggaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaat 575 

III llll III I I III I III II I llll 

Db 183 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 124 

Qy 576 acctgagtgcaaggaaaaactagatgaggataaggaacataaacatgagttcccaaagca 635 

I I II llll I I II II II I III I I III I 
Db 123 AAAAAAAMWAAAATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMNM 64 

Qy 636 tgaaaaagaagaggagaagaaacctgagaaaggcatagtaccctgagtgggttaa 690 

inn ii i iii iii i inn iii ii 

Db 63 AAAAAAAAAAAAAAAAAAAAAAADNMAAAAAGGAAANNNAANNNNAANNNANNAA 9 



RESULT 14 
AQ325799/C 
LOCUS       AQ325799     982 bp    DNA
DEFINITION  nbxb0021B14r CUGI Rice BAC Library Oryza sativa genomic clone
            nbxb0021B14r, genomic survey sequence.
ACCESSION   AQ325799
VERSION     AQ325799.1  GI:4117649
KEYWORDS    GSS.
SOURCE      Oryza sativa.
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
            Magnoliophyta; Liliopsida; Poales; Poaceae; Oryza.
REFERENCE   1  (bases 1 to 982)
  AUTHORS   Wing,R.A. and Dean,R.A.
  TITLE     A BAC End Sequencing Framework to Sequence the Rice Genome
  JOURNAL   Unpublished (1998)
COMMENT     Contact: Wing RA
            Clemson University Genomics Institute
            Clemson University
            100 Jordan Hall, Clemson, SC 29634, USA
            Tel: 864 656 7288
            Fax: 864 656 4293
            Email: rwing@clemson.edu
            Seq primer: GGAAACAGCTATGACCATG
            Class: BAC ends
            High quality sequence start: 4
            High quality sequence stop: 123. 1,
FEATURES             Location/Qualifiers
     source          1..982
                     /organism="Oryza sativa"
                     /strain="Japonica"
                     /cultivar="Nipponbare"
                     /db_xref="taxon:4530"
                     /clone="nbxb0021B14r"
                     /clone_lib="CUGI Rice BAC Library"
                     /tissue_type="Leaf"
                     /lab_host="E. coli DH10B"
                     /note="Vector: pBeloBAC11; Site_1: HindIII; Site_2:
                     HindIII; Rice is one of two most popular grains in the
                     world. Half of the world population especially those
                     inhabiting highly populated areas of the humid tropics
                     and subtropics, rely on rice as their primary source of
                     carbohydrate. Monocotyledonous rice is a diploid plant
                     (2n=24) with a haploid genome equivalent of 431 Mbp
                     (Arumuganathan and Earle, 1991). The relatively small
                     genome of rice, three times larger than that of
                     Arabidopsis, makes it suitable for genomic studies. In
                     order to facilitate positional cloning, physical mapping
                     and genome sequencing of rice, we have constructed a BAC
                     library from Oryza sativa, Nipponbare variety. The
                     library contains 36,864 clones with an average insert size
                     of 128.5 Kb providing 10.9 haploid genome equivalents. The
                     deep coverage allows the isolation a particular sequence
                     with a probability of 99.9 %. Two high density filters,
                     each containing 18,432 clones (doubly spotted), represent
                     the whole library for colony screening."
BASE COUNT      141 a     69 c     43 g    674 t     55 others
ORIGIN



Query Match 8.8%; Score 85.2; DB 101; Length 982;
Best Local Similarity 45.1%; Pred. No. 1.1e-10;
Matches 225; Conservative 0; Mismatches 274; Indels 0; Gaps

Qy 168 aaaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccc 227 

,. mi i mi i ii i ii i m i i n ii ii ii 

Db 633 AAAAAAAAAAAAAAAAAAAAAAATAAAAAAAAAMATAAMAAAAAAAAAAAAAAAAANN 574 

Qy 228 aaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatga 287 

III I I II III I II I Mill lllll III I I I 
Db 573 NAAAAAAAAAAAAAAAAAAAANNAAAMAAAAAAAAAAAAAAAAAAMAAAAAAAAAAAA 514 

Qy 288 agagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccga 347 

I I I I I I I II I I II I I III llll I 
Db 513 AAAAAMAAAAAAAAAANNNANAAAAAAAAAAAAAAAAAAAAAANNAAAANAAAAAAAAA 454 



Tue Sep 5 07:22:54 2000    US-08-984-099-1.rst    Page



ACCESSION   B08337
VERSION     B08337.1  GI:2089458
KEYWORDS    GSS.
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
            Magnoliophyta; eudicotyledons; Rosidae; eurosids II; Brassicales;
            Brassicaceae; Arabidopsis.
REFERENCE   1  (bases 1 to 1198)
  AUTHORS   Feng,J., Dewar,K., Buehler,E., Kim,C., Li,Y., Shinn,P., Sun,H. and
            Ecker,J.
  TITLE     BAC End Sequences at ATGC
  JOURNAL   Unpublished (1997)
COMMENT     Other_GSSs: T19F9-T7.1, T19F9-T7, T19F9-Sp6
            Contact: Ecker J.
            Arabidopsis Thaliana Genome Center
            University of Pennsylvania
            Dept. of Biology, University of Pennsylvania, Philadelphia, PA
            19104
            Tel: 215-898-9384
            Fax: 215-898-8780
            Email: jecker@atgenome.bio.upenn.edu
            Seq primer: Sp6
            Class: BAC ends
            High quality sequence start: 48
            High quality sequence stop: 540.
FEATURES             Location/Qualifiers
     source          1..1198
                     /organism="Arabidopsis thaliana"
                     /strain="Columbia"
                     /db_xref="taxon:3702"
                     /clone="T19F9"
                     /clone_lib="TAMU"
                     /sex="hermaphrodite"
                     /note="Vector: BeloBAC11; Site_1: HindIII; Site_2:
                     HindIII; Produced by Rod Wing"
BASE COUNT       98 a    101 c    108 g    757 t    134 others
ORIGIN



Query Match 9.0%; Score 86.8; DB 120; Length 1198;
Best Local Similarity 42.8%; Pred. No. 4.9e-11;
Matches 217; Conservative 0; Mismatches 290; Indels



Qy 169 aaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccca 228 

III I llll I II I II! I Mil I II I I 

Db 1092 fMmm^mm^mmMmmmipmmMmmmmm 1033 

Qy 229 aaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaa 288 

III I I II I I I II I llll Mill II I I II 
Db 1032 AAAAAAAAAAAAAAAAANMANAAAAAAAAAAAAANNAAAAAAANANNAANAAAAAAAAA 973 

Qy 289 gagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgat 348 

Ml I II I I II I I I III Mill I 

Db 972 AAAAAAAAAANNAANANANNAAN^AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 913 

Qy 349 ttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaata 408 

II Mil III I I I III I I II I llll 
Db 912 NNAAAAAANNMAAAAAAAAAAAAAANAAANAAAAAAANMAAAAAAAAAAAAAAANAAA 853 

Qy 409 cccgagtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccaggag 468 

I I II I III I I I II II II I III I II I II 
Db 852 AAANAAAAAAANAANAAAAANNAAAAAAAAAAAAANAAAAAAAAAAAAAAANAAAAAAAA 793 

Qy 469 tcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatgg 528 • 

I I II III I I II I I I II III I I Ml 

Db 792 AMNAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAMAAMAAAAAANNAAAMAAAAAA 733 

Qy 529 gaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaag 588 

llll III I III I III II I llll I I II 
Db 732 AAAAAAAAAAAAAAANTO1AAAAAAAAAAAAAAAAAAAAAAANAAAMAAAANAAAAAAAA 673 



Qy 589 gaaaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaagag 648 

Mill I ■ I I II II I III I I llll Mill I I 
Db 672 AAAAAAAAANNAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANAAAAAAAAAANAAA 613 

Qy 649 gagaagaaacctgagaaaggcatagta 675 

II III Mil III 

Db 612 NNAAANAAAAANAAAAAAAAAAAAANA 586 



RESULT 11 
AQ330286/C 

LOCUS       AQ330286      870 bp    DNA             GSS       08-JAN-1999
DEFINITION  nbxb0046J18r CUGI Rice BAC Library Oryza sativa genomic clone
            nbxb0046J18r, genomic survey sequence.
ACCESSION   AQ330286
VERSION     AQ330286.1  GI:4122136
KEYWORDS    GSS.
SOURCE      Oryza sativa.
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
            Magnoliophyta; Liliopsida; Poales; Poaceae; Oryza.
REFERENCE   1  (bases 1 to 870)
  AUTHORS   Wing,R.A. and Dean,R.A.
  TITLE     A BAC End Sequencing Framework to Sequence the Rice Genome
  JOURNAL   Unpublished (1998)
COMMENT     Contact: Wing RA
            Clemson University Genomics Institute
            Clemson University
            100 Jordan Hall, Clemson, SC 29634, USA
            Tel: 864 656 7288
            Fax: 864 656 4293
            Email: rwing@clemson.edu
            Seq primer: GGAAACAGCTATGACCATG
            Class: BAC ends
            High quality sequence start: 13
            High quality sequence stop: 104.
FEATURES             Location/Qualifiers
     source          1..870
                     /organism="Oryza sativa"
                     /strain="Japonica"
                     /cultivar="Nipponbare"
                     /db_xref="taxon:4530"
                     /clone="nbxb0046J18r"
                     /clone_lib="CUGI Rice BAC Library"
                     /tissue_type="Leaf"
                     /lab_host="E. coli DH10B"
                     /note="Vector: pBeloBAC11; Site_1: HindIII; Site_2:
                     HindIII; Rice is one of two most popular grains in the
                     world. Half of the world population especially those
                     inhabiting highly populated areas of the humid tropics
                     and subtropics, rely on rice as their primary source of
                     carbohydrate. Monocotyledonous rice is a diploid plant
                     (2n=24) with a haploid genome equivalent of 431 Mbp
                     (Arumuganathan and Earle, 1991). The relatively small
                     genome of rice, three times larger than that of
                     Arabidopsis, makes it suitable for genomic studies. In
                     order to facilitate positional cloning, physical mapping
                     and genome sequencing of rice, we have constructed a BAC
                     library from Oryza sativa, Nipponbare variety. The
                     library contains 36,864 clones with an average insert size
                     of 128.5 Kb providing 10.9 haploid genome equivalents. The
                     deep coverage allows the isolation a particular sequence
                     with a probability of 99.9 %. Two high density filters,
                     each containing 18,432 clones (doubly spotted), represent
                     the whole library for colony screening."
BASE COUNT       40 a     52 c     31 g    674 t     73 others
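The library figures quoted in the /note qualifier can be cross-checked with a few lines of arithmetic. The sketch below is not part of the database record: the function names are illustrative, and the probability uses the standard Clarke-Carbon formula P = 1 - (1 - f)^N as an assumption, since the note does not say how its 99.9 % was derived.

```python
def genome_equivalents(n_clones, insert_kb, genome_mbp):
    """Total cloned DNA divided by the haploid genome size."""
    return n_clones * insert_kb / (genome_mbp * 1000.0)


def recovery_probability(n_clones, insert_kb, genome_mbp):
    """Clarke-Carbon estimate (assumed here): P = 1 - (1 - f)^N,
    where f is the fraction of the genome held by one clone."""
    f = insert_kb / (genome_mbp * 1000.0)
    return 1.0 - (1.0 - f) ** n_clones


# Figures taken from the /note text: 36,864 clones, 128.5 kb inserts, 431 Mbp genome.
coverage = genome_equivalents(36864, 128.5, 431)
prob = recovery_probability(36864, 128.5, 431)
print(f"coverage: {coverage:.2f} haploid genome equivalents")
print(f"probability of recovering a given sequence: {prob:.5f}")
```

Under these assumptions the coverage comes out just under 11 genome equivalents, in line with the 10.9 stated in the note, and the recovery probability exceeds the quoted 99.9 %.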



Query Match 8.9%; Score 85.8; DB 101; Length 870; 

Best Local Similarity 44.6%; Pred. No. 8.1e-11;

Matches 225; Conservative 0; Mismatches 280; Indels 0; 
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Db 322 AAAATAATAAAAAAAAAAAAAAAMTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 381 

Qy 488 acgaagagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacg 547 

Ml I I I III Mill I III Mil III I 
Db 382 AAAAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANAAAAAAAAAAAAAAAAAAA 441

Qy 548 agaaacataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggata 607 

I III I I I II llll I I II Mill Mill 

Db 442 AAAAAAAAMAAAAAMAAAAAAAAAAAAAAANAAAAAAAAAAAAAAAAAAAAAAAAAAA 501 

Qy 608 aggaacataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaag 667 

I II I II I : : I II MM:: :| II :: 

Db 502 AAAAAAAAAAKAKAGKBCDKAABAKTAATTGGKKSGAABABSCBABAAACATRTTSHTBA 561 

Qy 668 gcatagtaccctgagtgggttaaaatgcctgaatggccgaagtccatgtttactcagtct 727 

I I: =1 I I: |::| I : :|:|:: :| ::|: I |:| 
Db 562 ACTTDHKAATTAWATTSATAMWVAATTCABGHTVGSBTSTSTATBSSBTWTTTATATYT 621 

Qy 728 ggctcgagcactaagccttaagccatatgacactggtgcatgtgccatcatcatgcagta 787 

:| I II : :IM :hl I: II I I II:: I ' 

Db 622 CBCACTTTTTTACTATCTSMYTSCATTCTCYASTYASGVSSATGTCKSTBTAATKVCMVA 681 

Qy 788 atttcatgggatattgtaattatat-tgttaataaaaaagatggtgagtgggaaatgtgt 846 

:l I: llll ::IH I I llll : I ||: :|| : :| |: I I 
Db 682 WTGVCTYTTTATTTAATTWWTATTTBTWTTAAAWTTTATGAKTKTGTVGVGTTSArTTTT 741 

Qy 847 gtgtgcattcatccatgagca-atgctgaatctctttgcatgcatagagattctgaatgg 905 

:: II :l II :|:| |: : l|:: I |: |: I :|| II : 
Db 742 ACTCSBTTTTVTACAMCWGYATAWTSATAAWVSCATWTTTTSTTTGATTSCKCTCAAAAS 801 

Qy 906 ttatagtttatgttatatcgtttgttctagtgaaattaattttgaatgttgtatgtaatg 965 

II I :M II I : II :| II I : II Mill I |: ||: :||: 

Db 802 TTTTYSTWTTTGAAAAWGCGAYTTTTTTTTWTTTTTTTATTTTTTTTTTWTTAWCKAABW 861 

Qy 966 tt 967 

II 

Db 862 TT 863 



RESULT 7 
AQ782441 

LOCUS       AQ782441      693 bp    DNA             GSS       02-AUG-1999
DEFINITION  HS_3174_A2_B03JR CIT Approved Human Genomic Sperm Library D
            Homo sapiens genomic clone Plate=3174 Col=6 Row=C, genomic
            survey sequence.
ACCESSION   AQ782441
VERSION     AQ782441.1  GI:5685401
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 693)
  AUTHORS   Mahairas,G.G., Wallace,J.C., Smith,K., Swartzell,S., Holzman,T.,
            Keller,A., Shaker,R., Furlong,J., Young,J., Zhao,S., Adams,M.D. and
            Hood,L.
  TITLE     Sequence-tagged connectors: A sequence approach to mapping and
            scanning the human genome
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 96 (17), 9739-9744 (1999)
  MEDLINE   99380589
COMMENT     On Mar 23, 1999 this sequence version replaced gi:3324197.
            Contact: Mahairas GG, Wallace JC, Hood L
            High Throughput Sequencing Center
            University of Washington
            401 Queen Anne Avenue North, Seattle, WA 98109, USA
            Tel: (206) 616-3618
            Fax: (206) 616-3887
            Email: jwallace@u.washington.edu
            Clones may be purchased from Research Genetics (info@resgen.com).
            BAC end Web Server: http://www.htsc.washington.edu
            Plate: 3174 row: C column: 6
            Seq primer: M13 Reverse
            Class: BAC ends
            High quality sequence stop: 693.
FEATURES             Location/Qualifiers
     source          1..693
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /clone="Plate=3174 Col=6 Row=C"
                     /clone_lib="CIT Approved Human Genomic Sperm Library D"
                     /sex="male"
                     /note="Organ: sperm; Vector: pBeloBAC11; BAC Clones in
                     E-Coli DH10B"
BASE COUNT      520 a     21 c     25 g     38 t     89 others
ORIGIN



Query Match 9.1%; Score 87.8; DB 114; Length 693; 

Best Local Similarity 44.5%; Pred. No. 2.6e-ll; 

Matches 218; Conservative 0; Mismatches 272; Indels 0; Gaps 

Qy 168 aaaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccc 227 

llll I llll llll III III I llll Mill 

Db 175 AAAAAAMAAAAAAAAAAAAAANMAAAAAAAMNAAAAAAAAAANAAAAAAAANNNANN 234 

Qy 228 aaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatga 287

llll I I II III I II HIM Hill II I I I 
Db 235 AAAAAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAAAAAAAAAANAAAAAANNAAAAAAA 294 

Qy 288 agagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccga 347 

I II I I I I III Ml II I III Mill I 
Db 295 AAAAAMAAMAAAAMNAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 354 

Qy 348 tttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaat 407 

III III ill II I III II II I Mil 
Db 355 AANNAMAAAANNAAAANNMNAAAAAAAMAAAAAMNAAAAAAAAAAAAAAAAAAAAA 414 

Qy 408 acccgagtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccagga 467

I I III III II I I II II III III IN I .. 
Db 415 AAAANAAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAANAAANAAAAAAAAANAANAAAN 474 

Qy 468 gtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatg 527

I Ml III II II I I III II II I III '■ 
Db 475 NAAAAAAAAAAAAAAAAAAAAMAAMAAAAAAAAAAAAMAAMAAAAAAAAAAAAAAA 534

Qy 528 ggaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaa 587 

llll III II I I I III II I llll I I II 
Db 535 AAAAAAAAAAAAAAAAAAAAAAMANMAAAAAAAAAAAAAAAAAAAAAANNAAAAAAAA 594 

Qy 588 ggaaaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaaga 647 

Ml I II Mill Mill I llll Hill II I 
Db 595 ANANAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMNNAAAAAAAAAAAAAAAAAAAA 654 

Qy 648 ggagaagaaa 657 

I II III 
Db 655 AAAAAAAAAA 664 



CNS0122R 

LOCUS       CNS0122R      839 bp    DNA             GSS       26-JUL-1999
DEFINITION  Drosophila melanogaster genome survey sequence SP6 end of BAC
            BACN07E20 of DrosBAC library from Drosophila melanogaster (fruit
            fly), genomic survey sequence.
ACCESSION   AL101037
VERSION     AL101037.1  GI:5612648
KEYWORDS    GSS.
SOURCE      fruit fly.
  ORGANISM  Drosophila melanogaster
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1  (bases 1 to 839)
  TITLE     Direct Submission
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Db 634 AMMMMMGTSTMCMCCGTMYKKMMMMhMAAWMGTBASTMYMTMMCKMMKBCYCMMGAA 575 

Qy 151 ctgccacaattggcttcaaaatacgaaaagcacgaagagtctgaatacaaacagccaaaa 210 

::: : :: : ■■■■l :|: II I : :|l I III I llll 

Db 574 MKKTSTTGHKMTWMMMMMMMAAMMTTCMMAMAAW 515 

Qy 211 tatcatgaagagtacccaaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaa 270 

i i in i iiii i i ii iii rn i urn him 

Db 514 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAAAAAAAMAAAAAAAAAAAAAAA 455 

Qy 271 ccctgcaaacatcatgaagagtaccacgagtcacgcgaatcgaaggagcacgaagagtac 330 

III I I II I I I I I II II I I II I I 

Db 454 AAAAAAAAAAAAAAAAAAAAAAAAAANAAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 395 

Qy 331 gataaagaaaaacccgatttccccaaatgggaaaagcctaaagagcacgagaaacacgaa 390 

I III HMI I III MM III I I I III I II 
Db 394 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 335 

Qy 391 gtcgaatatccgaaaatacccgagtacaaggacaaacaagatgagaataagaaacataaa 450 

II I llll I I I II I III II I I II II III I III 
Db 334 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 275 

Qy 451 gatgaagagtgccaggagtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaa 510 

llll I I I I II III I I II I I I III lllll 
Db 274 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAAAAAAAAAAAAAAAAAA 215 

Qy 511 cccgatttccccaaatgggaaaagcctaaagggcacgagaaacataaagccgaatatccg 570 

I III llll III I lllll III II I 
Db 214 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 155 

Qy 571 aaaatacctgagtgcaaggaaaaactagatgaggataaggaacataaacatgagttccca 630 

llll I II lllll I II MUM Ml I I I 
Db 154 AAAAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 95 

Qy 631 aagcatgaaaaagaagaggagaagaaacctgagaaa 666 

II I lllll III III III III 
Db 94 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANNAAA 59 



RESULT 4 
B12963/C 

LOCUS       B12963        759 bp    DNA             GSS       14-MAY-1997
DEFINITION  T23D1-T7.1 TAMU Arabidopsis thaliana genomic clone T23D1,
            genomic survey sequence.
ACCESSION   B12963
VERSION     B12963.1  GI:2094085
KEYWORDS    GSS.
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
            Magnoliophyta; eudicotyledons; Rosidae; eurosids II; Brassicales;
            Brassicaceae; Arabidopsis.
REFERENCE   1  (bases 1 to 759)
  AUTHORS   Feng,J., Dewar,K., Buehler,E., Kim,C., Li,Y., Shinn,P., Sun,H. and
            Ecker,J.
  TITLE     BAC End Sequences at ATGC
  JOURNAL   Unpublished (1997)
COMMENT     On Dec 15, 1999 this sequence version replaced gi:4123328.
            Other_GSSs: T23D1-Sp6.1, T23D1-Sp6, T23D1-T7
            Contact: Ecker J.
            Arabidopsis Thaliana Genome Center
            University of Pennsylvania
            Dept. of Biology, University of Pennsylvania, Philadelphia, PA
            19104
            Tel: 215-898-9384
            Fax: 215-898-8780
            Email: jecker@atgenome.bio.upenn.edu
            Seq primer: T7
            Class: BAC ends
            High quality sequence start: 88
            High quality sequence stop: 127.
FEATURES             Location/Qualifiers
     source          1..759
                     /organism="Arabidopsis thaliana"
                     /strain="Columbia"
                     /db_xref="taxon:3702"
                     /clone="T23D1"
                     /clone_lib="TAMU"
                     /sex="hermaphrodite"
                     /note="Vector: BeloBAC11; Site_1: HindIII; Site_2:
                     HindIII; Produced by Rod Wing"
BASE COUNT          a     40 c     44 g    427 t    219 others
ORIGIN



Query Match 9.4%; Score 90.8; DB 120; Length 759;
Best Local Similarity 33.7%; Pred. No. 4.9e-12;
Matches 209; Conservative 0; Mismatches 412; Indels 0;

Qy 194 aatacaaacagccaaaatatcatgaagagtacccaaaacatgagaagcctgaaatgtaca 25! 

II I III I llll I I II I I II I I III I 
Db 756 AANAAAAAAAAAAAAAAAAAAANAAAAAAAAANANNAAANNAAAANAAAAAAAANNNNAA 69' 

Qy 254 aggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgagtcacgcgaatcga 31! 

I I II I lllll I III I II I I I I I II I 
Db 696 MAAAMMNAAAAAAANAATAAAAAAAANNNAAANAANAAAAAAAAAAAAAAAAAAANA 63' 

Qy 314 aggagcacgaagagtacgataaagaaaaacccgatttccccaaatgggaaaagcctaaag 37: 

I II I I III lllll I I III llll II 
Db 636 NAMAAMAAMANANNAAAAAAAAAAAAACAAAAANNNAAAAAAAAAAAAAAAAANAAN 57' 

Qy 374 agcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaaggacaaacaagatg 43: 

I I I II I I III III I I II I III II 
Db 576 AAAAAAAMAAAAAANMAAAAMANNMNMAAAANAAAAAAAAAAAAAAAAANAAANN 51' 

Qy 434 agaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaag 49: 

I II II III I I I I II I II I II III I II 
Db 516 AAAMAAAAAAAAAMAAAAAAAAAAAAAAAAAANNACNAANAANNNAAANNAAAAAAAA 45' 

Qy 494 agtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaac 55: 

I I I III lllll I II UN III II I' 
Db 456 AAAMNAAAAAAAAAAANAMMNNA1MAAMNMAAAAAAAAAAAAAAAAAAANMAN' 39' 

Qy 554 ataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaac 61: 
I I II III I I I lllll I I I I II II 

Db 396 ANNANNNNAAANNANNANAAAAAAAAAAAAAANAAAAAAAAAAAAAAAAAAAAAAAAAAA 33' 

Qy 614 ataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcatag 67: 

I III I I III I lllll III III III lllll 
Db 336 MAAAAAAJ^MAAAAAAAAAAAIAAAJ™ 27' 

Qy 674 taccctgagtgggttaaaatgcctgaatggccgaagtccatgtttactcagtctggctcg 73: 

llll 

Db 276 NNAANNNNAAANAAAAAAAANNNNNNNNNNNNNNNNNNNNNNANNNNANNNNNNNNNNNN 21' 

Qy 734 agcactaagccttaagccatatgacactggtgcatgtgccatcatcatgcagtaatttca 79: 
I I 

Db 216 NNNNNNNNNCNNMNNNNANNNNNMNNNNNNNNNNNNANNNNNNNNNNNNNNNNNNNNN 15' 



794 tgggatattgtaattatattg 814 

II I I 

156 NNNNNNNNNNNNNNCATCTCG 136 



RESULT 5 
CNS00HGZ 

LOCUS       CNS00HGZ     1101 bp    DNA             GSS       03-JUN-1999
DEFINITION  Drosophila melanogaster genome survey sequence T7 end of BAC
            BACR35012 of RPCI-98 library from Drosophila melanogaster (fruit
            fly), genomic survey sequence.
ACCESSION   AL073472
VERSION     AL073472.1  GI:4953252
KEYWORDS    GSS.
SOURCE      fruit fly.
  ORGANISM  Drosophila melanogaster
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gb_gssl3:< 
gb_gss14:*
gb_gss15:*
gb_gss16:*
gb_gss17:*
gb_gss18:*
gb_gss19:*
em_gss13:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
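The report does not say how the search program analyses the score distribution, so the following is only a rough sketch of the general idea, not the program's actual method. It assumes the chance scores follow a Gumbel (extreme-value) distribution fitted by the method of moments; the function name and the background scores are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def expected_chance_hits(background_scores, score):
    """Expected number of background hits scoring >= score under a Gumbel fit."""
    mean = statistics.fmean(background_scores)
    stdev = statistics.pstdev(background_scores)
    beta = stdev * math.sqrt(6.0) / math.pi   # Gumbel scale (method of moments)
    mu = mean - EULER_GAMMA * beta            # Gumbel location
    # Tail probability P(S >= score) = 1 - exp(-exp(-(score - mu) / beta))
    tail = -math.expm1(-math.exp(-(score - mu) / beta))
    return len(background_scores) * tail


# Hypothetical background scores, for illustration only (not taken from this report).
background = [20.0 + (i % 17) for i in range(1000)]
for s in (30.0, 60.0, 90.0):
    print(f"score {s:5.1f}: about {expected_chance_hits(background, s):.3g} chance hits")
```

The expected count falls steeply as the score moves away from the bulk of the distribution, which is why hits with modest scores in the table below can still carry very small Pred. No. values.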



Result         Query
  No.   Score  Match  Length    DB  ID        Description
    1    95.8    9.9     593   122  CNS00880  AL051540 Drosophil
c   2    94.2    9.7     997   122  CNS005TE  AL060767 Drosophil
c   3    93.8    9.7     796   122  CNS0118D  AL099943 Drosophil
c   4    90.8    9.4     759   120  B12963    B12963 T23D1-T7.1
    5    90.6    9.4    1101   122  CNS00HGZ  AL073472 Drosophil
    6    88.2    9.1     952   123  CNS014BF  AL103941 Drosophil
    7    87.8    9.1     693   114  AQ782441  AQ782441 HS_3174_A
    8    87.6    9.1     839   122  CNS0122R  AL101037 Drosophil
c   9      87    9.0    1042   123  CNS0148K  AL103838 Drosophil
c  10    86.8    9.0    1198   120  B08337    B08337 T19F9-Sp6.1
c  11    85.8    8.9     870   101  AQ330286  AQ330286 nbxb0046J
   12    85.6    8.9     817   122  CNS009FM  AL053514 Drosophil
c  13    85.4    8.8     736   122  CNS009DE  AL053636 Drosophil
c  14    85.2    8.8     982   101  AQ325799  AQ325799 nbxb0021B
   15    84.8    8.8    1101   123  CNS0153V  AL104965 Drosophil
c  16    84.6    8.7    1223   120  B12981    B12981 T24D11-Sp6
   17    84.4    8.7     791   122  CNS009KS  AL053801 Drosophil
   18    83.8    8.7     822   114  AQ752069  AQ752069 HS.5570J
   19    83.8                                 AL050945 Drosophil
   20    83.6    8.6     710    71  AW349204  AW349204 GM210004A
   21    83.2    8.6    1046   122  CNS00ZKO  AL097794 Drosophil
c  22    83.2    8.6    1059   122  CNS00Z2B  AL097133 Drosophil
c  23    82.8    8.6     732    96  AQ257374  AQ257374 nbxb0018K
c  24    82.4    8.5     569   101  AQ329762  AQ329762 nbxb0045P
c  25    82.4    8.5     956   101  AQ330169  AQ330169 nbxb0046L
c  26    81.8    8.5     840    96  AQ288571  AQ288571 nbxb0033l
   27    81.8    8.5     858   122  CNS0127J  AL101209 Drosophil
   28    81.6    8.4     507    91  W82081    W82081 me96h06.r1
c  29    81.6    8.4     870   116  AQ866797  AQ866797 nbeb0029E
c  30    81.4    8.4     865    96  AQ324474  AQ324474 mgxb0018B
c  31    81.4    8.4    1044   122  CNS00K3G  AL077176 Drosophil
c  32    81.2    8.4     864    64  AW155256  AW155256 mgie0002P
   33    80.8    8.4     830   122  CNS0118J  AL099949 Drosophil
c  34    80.6    8.3     506   122  CNS0Q9K4  AL053777 Drosophil
c  35    80.6    8.3     844   120  B10796    B10796 T26G15-Sp6
c  36    80.6    8.3    1101   122  CNS00OSX  AL050813 Drosophil
c  37    80.6    8.3    1101   122  CNS00Z15  AL097091 Drosophil
c  38    80.2    8.3    1147   120  B13042    B13042 T30M24-Sp6.
c  39      80    8.3     949   101  AQ325830  AQ325830 nbxb0021F
c  40    79.6    8.2     700    46  AI906328  AI906328 PM-BT107-
c  41    79.6    8.2     815   116  AQ853920  AQ853920 nbxb0046G
c  42    79.6    8.2     968   113  AQ687544  AQ687544 nbxb0075l
   43    79.2    8.2     776   122  CNS009BD  AL053563 Drosophil
   44    78.8    8.1     952   117  AQ897460  AQ897460 HSJ13U
c  45    78.6    8.1     833   102  AQ446640  AQ446640 nbxb0070F



LOCUS       CNS00880      593 bp    DNA             GSS       03-JUN-1999
DEFINITION  Drosophila melanogaster genome survey sequence TET3 end of BAC
            BACR16J23 of RPCI-98 library from Drosophila melanogaster (fruit
            fly), genomic survey sequence.
ACCESSION   AL051540
VERSION     AL051540.1  GI:4933381
SOURCE      fruit fly.
  ORGANISM  Drosophila melanogaster
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1  (bases 1 to 593)
  AUTHORS   Genoscope.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage :
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
            - Web : www.genoscope.cns.fr)
COMMENT     Determination of this BAC-end sequence was carried out as part of a
            collaboration with the Berkeley Drosophila Genome Project (BDGP).
            The BDGP is constructing a physical map of the Drosophila
            melanogaster genome using these BACs. For further information
            please see http://www.fruitfly.org The BDGP Drosophila
            melanogaster BAC library was prepared by Kazutoyo Osoegawa and
            Aaron Mammoser in Pieter de Jong's laboratory in the Department of
            Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
            NY. The library is named RPCI-98 and was constructed by partial
            EcoRI digestion of Drosophila DNA provided by the BDGP from the
            isogenic strain y2; cn bw sp, the same strain used for the BDGP's
            P1 and EST libraries. A more detailed description of the library
            and how to order individual BAC clones, the entire library, or
            filters for hybridization from the BACPAC Resource Center can be
            found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES             Location/Qualifiers
     source          1..593
                     /organism="Drosophila melanogaster"
                     /db_xref="taxon:7227"
                     /clone_lib="RPCI-98"
                     /clone="BACR16J23"
                     /note="end : TET3"
BASE COUNT      448 a     25 c     18 g     10 t     92 others
ORIGIN



Query Match 9.9%; Score 95.8; DB 122; Length 593;

Best Local Similarity 41.9%; Pred. No. 2.9e-13; 

Matches 224; Conservative 58; Mismatches 250; Indels 3; Gaps 1; 

Qy 132 acaaacaacctcatcagagctgccacaattggcttcaaaatacgaaaagcacgaagagtc 191 

I III II II I: II : llh:l INI I II I 

Db 39 AAAMTAAAACMMGGAAAAAAMCMAAAAAAAMAAAAMWWAWAAAAAAAAAAAAAAAMA 98 

Qy 192 tgaatacaaacagccaaaatatcatgaagagtacccaaaacatgagaagcctgaaatgta 251 

I 11:1: : : |:| : I I: II II Ihl I : II Ml I 
Db 99 AMATMCMAMAMMMACAMAAMAAAAAAMAAGGACAAAAMAAAMAMMAACAAAAAAAAAAA 158 

Qy 252 caaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgagtcacgcgaatc 311 

II I lllll Mill : : III I I I: I Ihl I I h 
Db 159 AAAACAAAAAAAAAAAAAAAMACRRAAAAAAAAAAAMAAAAACMAAAAAAAAAAAAAMAA 218 

Qy 312 gaaggagcacgaagagtacgataaagaaaaacccgatttccccaaatgggaaaagcctaa 371 

:: I I II I I I III I III : I hi 'I::: II 

Db 219 AMMAAAAAAAAAAAAMAAAAAAAAAAGAGAACRMAAAAMAAAAAMAAAAAAAAMRMAAAA 278 

Qy 372 agagcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaaggacaaacaaga 431 

I I :| I III : I: II : : ::ll h: :| I II I III h:l 

Db 279 AAAAMAAAAAAAAAMAMAfflCAMAAAMAAMAMRAAAAMMRRAAAAAAAMAAAAAAAARRA 338 

Qy 432 tgagaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacga 491 

I II I : : f 1 1 : 'III I |:: I I I I I h III I : I 
Db 339 MAAAAAAAMRAAAMGGAAAAAGGMMAAARAAAAMAAMAAAAAATAMAAGAACAAATMAAA 398 

Qy 492 agagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaa 551 

II I I III llllll : :| :|| hi I II lllll I II 

Db 399 AAAMAAAAAAAAAAAAAAACAARAAAAAMCAMAAMAAAAMACGAAAAAGGGGCAAAAAAA 458 
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inn iiiiii i n n i m i n i i i iiimi i i 

576 AGAAAATGAAGATGAAGAAAAAAAAGAAAAAGAAGAAGAACAAGAAGATGAAAAAATATA 635 

648 ggagaagaaacctgagaaaggcatagta 675 

I I III III llll 
636 TGTTGAAAAAGAAAAAGATGAAGAAGTA 663 



RESULT 14 
T05868 

ID   T05868 standard; DNA; 3399 BP.
AC   T05868;
DT   14-AUG-1996 (first entry)
DE   Chicken leucocytozoan DNA encoding immunogenic protein for vaccines.
KW   Chicken leucocytozoan; immunogen; recombinant vaccine; protection;
KW   immunisation; vaccination; ss.
OS   Chicken leucocytozoan.
FH   Key             Location/Qualifiers
FT   cds             1..3399
FT                   /*tag= a
FT   misc_feature    1150..3218
FT                   /*tag= b
FT                   /note= "fragment referred to in the claims, for
FT                   use as insert in a recombinant vaccine
FT                   against chicken leucocytozoan disease"
PN   J07284392-A.
PD   31-OCT-1995.
PF   19-APR-1994; 080643.
PR   19-APR-1994; JP-080643.
PA   (DOBU-) DOBUTSUYO SEIBUTSUGAKUTEKI SEIZAI KYOKAI.
PA   (KITA ) KITASATO KENKYUSHO SH.
DR   WPI; 96-006311/01.
DR   P-PSDB; R97866.
PT   Chicken leucocytozoan immunogenic protein - used in a recombinant
PT   vaccine against chicken leucocytozoan disease
PS   Claim 6; Page 6-9; 35pp; Japanese.
CC   T05868 encodes a chicken leucocytozoan immunogenic protein, this DNA
CC   or a fragment of it can be used in a recombinant vaccine to immunise
CC   against chicken leucocytozoan disease. The DNA is used in a vector
CC   and operatively linked to an expression regulatory sequence as in
CC   standard practice.
SQ   Sequence 3399 BP; 1577 A; 508 C; 798 G; 516 T;

Query Match 9.1%; Score 88.4; DB 1; Length 3399;
Best Local Similarity 48.8%; Pred. No. 3.1e-13;

Matches 239; Conservative 0; Mismatches 251; Indels 0; Gaps 0; 
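The Best Local Similarity figure can be reproduced from the counts on the Matches line. The relationship used below (matches divided by all aligned columns, with indels counted as columns) is inferred from the numbers printed in this report rather than taken from the search program's documentation, so treat it as an assumption; the function name is illustrative.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are exact matches (inferred relationship)."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Counts copied from three results in this report.
for label, counts in [
    ("T05868", (239, 0, 251, 0)),
    ("AL051540", (224, 58, 250, 3)),
    ("pCGN5610 cassette", (322, 0, 35, 15)),
]:
    print(f"{label}: {best_local_similarity(*counts):.1f}%")
```

For these three hits the computed values agree with the percentages the report prints (48.8%, 41.9% and 86.6%).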

Qy 168 aaaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccc 227 

II I I III I I III I III II I I II I I lllll I 
Db 2559 AACACATGAAGAAGAAGAAAAAGTAACATATGAAGAAGAAGAAGAAGAAGAAGAAAAAGT 2618 

Qy 228 aaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatga 287 

II IIIIII I llll I I III II II II I llll I II 

Db 2619 AACACATGAAGAAGAAGAAAAAGTAACACATGAAGAAGAAGAAAAAGTAACACATGAAGA 2678 

Qy 288 agagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccga 347 

III I I I II III I II I lllll I II lllll II II 

Db 2679 AGAAGAAAAAGTAATACATGAAGAAGAAGAAAAAGAAGAAGATGAGGAAGAAGAAGAAGA 2738 

Qy 348 tttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaat 407 

II II I llll I II I I llll III I II 
Db 2739 AGAAGAAGAAGAAGAGGAAGAAGAAGAAGAAGAAGATGAGGAAGAGGAAGAAGAAGAAGA 2798 

Qy 408 acccgagtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccagga 467 

I II I I II II llll II I II I I II lllll I I 
Db 2799 AGAAGATGAGGAAGAGGAAGAAGAAGAAGAAAATGAGGAAGAAGAAGAAGAAGAAAATAA 2858 

Qy 468 gtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatg 527 

I I I III llll I IIIIII I II lllll llll II 
Db 2859 GGAAGAAGAAGAAGAAGAAAAAGAAGAGCATGAAGAAGAAGTAACACATGAAGAAGAAGA 2918 

Qy 528 ggaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaa 587 



lllll II I II II I llll I II III I I 
Db 2919 AGAAAAAGTAACACATGAAGAAGAAGAAAAAGTAACACATGAAGAAGAAGAAAATGTAAC 2978 

Qy 588 ggaaaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaaga 647 

I II III II II IN I llll III I I lllll 
Db 2979 ATATGAAGAAGAAGAAGAAAAAGTAACACATGAAGAAGAAGAAAAAGTAACACATGAAGA 3038 

Qy 648 ggagaagaaa 657 

II I III 
Db 3039 AGAAGAAAAA 3048 



RESULT 15 
X33181 

ID   X33181 standard; DNA; 6644 BP.
AC   X33181;
DT   25-JUN-1999 (first entry)
DE   Base sequence of the plasmid pRx-ires-bsr.
KW   Cowpox virus; bsr; viral vector; expression; apoptosis; resistance;
KW   crmA; bcl-2; bcl-xl; FLIP; survivin; IAP; ILP; adenovirus; cancer;
KW   autoimmune disease; graft rejection reaction; inflammation;
KW   inflammatory disease; ss.
OS   Synthetic.
OS   Cowpox virus.
PN   WO9913073-A2.
PD   18-MAR-1999.
PF   07-SEP-1998; J04010.
PR   08-SEP-1997; JP-259235.
PA   (RPRG-) RPR GENCELL ASIA PACIFIC INC.
PI   Hamada H;
DR   WPI; 99-243728/20.
PT   New apoptosis-resistant virus-sensitive cell
PS   Example 1; Page 38-41; 51pp; English.
CC   The present invention describes an apoptosis-resistant virus-sensitive
CC   cell line into which an apoptosis resistance gene has been introduced.
CC   The recombinant viruses generated are capable of expressing apoptosis-
CC   associated genes. These can then be used in a variety of diseases for
CC   which the induction of apoptosis by gene transfer, or where the
CC   inhibition of harmful apoptosis, is therapeutic. The recombinant viruses
CC   are useful as vectors for gene therapy which can be applied to cancer
CC   therapy for destroying cancer cells selectively, the treatment of
CC   autoimmune diseases and graft rejection reaction, and apoptosis induction
CC   therapy for inflammatory cells in inflammatory diseases. Prior arts have
CC   encountered the problem where if an adenovirus vector capable of
CC   expressing an apoptosis-associated gene is introduced into animal cells,
CC   the cells producing the virus will be destroyed because the period of
CC   time required to induce cell death by apoptosis is shorter than that
CC   required to replicate and produce the virus, resulting in failure to
CC   obtain a recombinant virus having the integrated apoptosis-associated
CC   gene. In this invention an apoptosis-resistant 293 cell line (having an
CC   apoptosis resistant gene introduced) is established and overcomes the
CC   problem. The present sequence represents the base sequence of the
CC   plasmid pRx-ires-bsr, which contains the cowpox virus bsr gene, and
CC   is used in an example from the present invention.
SQ   Sequence 6644 BP; 2166 A; 1573 C; 1424 G; 1481 T;

Query Match 8.7%; Score 84.4; DB 1; Length 6644;
Best Local Similarity 47.4%; Pred. No. 3.9e-12;

Matches 253; Conservative 0; Mismatches 281; Indels 0; Gaps 0; 

Qy 133 caaacaacctcatcagagctgccacaattggcttcaaaatacgaaaagcacgaagagtct 192 

III llll I II III I II llll I llll I II I 
Db 3717 CAATAMCCCTCTTGCAGTTGCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 3776 

Qy 193 gaatacaaacagccaaaatatcatgaagagtacccaaaacatgagaagcctgaaatgtac 252 

II I III I llll IIIIII llll I I II III I 
Db 3777 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 3836 

Qy 253 aaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgagtcacgcgaatcg 312 

II I lllll lllll III I I II I I I I I II 

Db 3837 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 3896 
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PT DNA construct contg. gene of interest controlled by cotton fibre 

PT transcriptional factor - used to produce altered phenotype cotton 

PT fibre cells expressing genes affecting pigmentation 

PS Example 5; Fig 3A-J; 95pp; English. 

CC The present sequence is a 4*4 cotton fibre expression cassette (version 

CC II) from promoter construct pCGN5610. The lambda genomic phage clone used 

CC to form this construct was designated 4-4(6). DNA constructs containing 

CC cotton fibre- specific transcriptional factor promoters are useful to 

CC produce cotton fibre cells with altered phenotype/ especially altered 

CC colour. Genes involved in the production of melanin (e.g. tyrosinase 

CC gene and ORF438 encoded protein from Streptomyces antibioticus) and 

CC indigo (mono-oxygenase genes possibly in conjunction with a 

CC tryptophanase gene) are of interest. The promoters of the invention are 

CC reliable and permit expression of a protein selectively in cotton fibre 

CC to affect qualities such as fibre strength, length, colour and dyabllity 

CC as required. The construct and methods can also be used for the 

CC introduction of other advantageous genes into a cotton plant, e.g. a 

CC plant hormone. In particular, fibres from a plant producing coloured 

CC fibres may be used to produce yarns and/or fabrics that do not require 

CC dyeing. 

SQ Sequence 5518 BP; 1886 A; 794 C; 815 G; 2022 T;



Query Match 28.5%; Score 276; DB 1; Length 5518;

Best Local Similarity 86.6%; Pred. No. 3e-60;

Matches 322; Conservative 0; Mismatches 35; Indels 15; Gaps 1; 

Qy 1 ctttctatttggttaaccatggctcataactttcgtcatcctttcttccttttccaactt 60 

MINIUM IMIIIIMIMIM I 1 1 1 1 1 1 1 IIIIIIIIIMIIMIIIMI 
Db 4139 CTTTCTATTTGATTAACCATGGCTCATAGCATTCGTCACCCTTTCTTCCTTTTCCAACTT 4198 

Qy 61 ttactcattactgtctcactaatgatcggtagccacaccgtctcgtcagcggctcgacat 120 

iiiniii i MiiiiiM iii imimiiii n in iiiiiiiiim i 

Db 4199 TTACTCATAAGTGTCTCACTAGTGACCGGTAGCCACACTGTTTCGGCAGCGGCTCGACGT 4258 
Qy 121 ttattccacacacaaacaacctcatcagagctgccacaattggcttcaaaatacgaaaag 180 

linn i mill iiiiiiiiiiiimi miiiiiiiiiiiiiiiniiiiiii 

Db 4259 TTATTCGAGACACAAGCAACCTCATCAGAGCTCCCACAATTGGCTTCAAAATACGAAAAG 4318 

Qy 181 cacgaagagtct---------------gaatacaaacagccaaaatatcatgaagagtac 225

llllllllllll M 1 1 1 1 1 1 1 1 1 II 1 1 1 ! Illll lllllllll 

Db 4319 CACGAAGAGTCTGAATACGAAAAGCCAGAATACAAACAGCCAAAGTATCACGAAGAGTAC 4378 

Qy 226 ccaaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcat 285 

llllll llllllllllllllll I 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M 1 1 1 1 M 1 1 III 
Db 4379 TCAAAACTTGAGAAGCCTGAAATGCAAAAGGAGGAAAAACAAAAACCCTGCAAACAGCAT 4438 

Qy 286 gaagagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaaccc 345 

llllllllllllllinil llllll llllll lllllllll! MINI I I 
Db 4439 GAAGAGTACCACGAGTCACACGAATCAAAGGAGCAAAAAGAGTACGAGAAAGAAAATCTC 4498 



Qy 346 gatttccccaaa 357

II III II
Db 4499 GACGGGCCCGAA 4510
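
The summary figures printed for each hit appear to be related by simple arithmetic. The sketch below is inferred from the numbers in this listing, not taken from the search program's documentation: it assumes Best Local Similarity is matches divided by all aligned columns (matches + mismatches + indels) and that Query Match is the score divided by the query length (967, read off the query coordinates). The function names are hypothetical.

```python
def best_local_similarity(matches, mismatches, indels):
    """Percent of aligned columns that are identical (assumed definition)."""
    return 100.0 * matches / (matches + mismatches + indels)


def query_match(score, query_length):
    """Score as a percentage of the query length (assumed definition)."""
    return 100.0 * score / query_length


# Figures from the hit above: Matches 322; Mismatches 35; Indels 15; Score 276.
similarity = round(best_local_similarity(322, 35, 15), 1)
match_pct = round(query_match(276, 967), 1)
print(similarity, match_pct)
```

Under these assumptions the printed values agree with the 86.6% and 28.5% reported for this hit; the same relations hold for the other hits in this listing whose counts are legible.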



RESULT 11 
T73865 

ID T73865 standard; DNA; 5547 BP. 

AC T73865; 

DT 26-JAN-1998 (first entry) 

DE Cotton fibre promoter clone 4-4(6) construct, pCGN5606 (Version I).

KW promoter; fibre-specific; transcriptional factor; promoter;

KW altered phenotype; colour; melanin; indigo; ss.

OS Gossypium hirsutum cv. Coker 130.

FH Key Location/Qualifiers

FT misc_feature 1..65

FT /*tag= a

FT /note= "fragment of pBluescriptII polylinker (as

FT stated in the specification)"

FT misc_feature 57..5494

FT /*tag= b

FT /note= "genomic clone 4-4(6) from lambda phage clone of

FT a cotton Coker 130 genomic library (as stated in

FT the specification)"

FT misc_RNA 65..4163

FT /*tag= c

FT /note= "5' flanking region of the 4-4(6) gene (as

FT stated in the specification)"

FT CDS 4163..4502

FT /*tag= d

FT /note= "corresponds to part of the 4-4(6) ORF (as

FT stated in the specification)"

FT CDS complement(4131..4502)

FT /*tag= i

FT /transl_except= (pos:4170..4172, aa:Xaa)

FT /transl_except= (pos:4182..4184, aa:Xaa)

FT /note= "Xaa = stop codon; No start or stop codons

FT given, possibly conforms to exon structure.

FT Encodes W21899"

FT misc_feature 4502..4555

FT /*tag= e

FT /note= "synthetic polylinker oligonucleotide containing

FT unique target sites for EcoRI, SmaI, SalI, NheI

FT and BglII"

FT misc_feature 4163..4555

FT /*tag= f

FT /note= "stuffer fragment left in place to facilitate the

FT monitoring of cloning manipulations (as stated in

FT the specification)"

FT 3'UTR 4555..5494

FT /*tag= g

FT /note= "corresponds to the 940 nucleotides downstream of

FT the stop codon and constitutes the 3' flanking

FT region of the 4-4(6) gene (as stated in the

FT specification)"

FT misc_feature 5494..5547

FT /*tag= h

FT /note= "fragment of pBluescriptII polylinker (as stated

FT in the specification)"
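
Feature locations in records like this one take the form start..end, optionally wrapped in complement(...) or join(...). As a minimal sketch of how such strings could be read programmatically, the following Python handles only those three forms; the function name parse_location is hypothetical, and partial-range markers such as < and > are deliberately not supported.

```python
import re


def parse_location(text):
    """Return (strand, [(start, end), ...]) for a simple feature location.

    Handles 'a..b', 'complement(...)' and 'join(...)' only (an assumption
    made for this sketch; real feature tables allow more operators).
    """
    s = re.sub(r"\s+", "", text)          # tolerate stray whitespace
    strand = "+"
    m = re.fullmatch(r"complement\((.*)\)", s)
    if m:
        strand, s = "-", m.group(1)
    m = re.fullmatch(r"join\((.*)\)", s)
    if m:
        s = m.group(1)
    spans = []
    for part in s.split(","):
        start, _, end = part.partition("..")
        spans.append((int(start), int(end or start)))  # single base: a..a
    return strand, spans


print(parse_location("complement(4131..4502)"))
print(parse_location("join(18885..18967,19407..19725)"))
```

Returning the strand separately from the list of spans keeps the common case (one span on the forward strand) easy to consume while still covering spliced features.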

PN WO9640924-A2. 

PD 19-DEC-1996.

PF 07-JUN-1996; U09897.

PR 07-JUN-1995; US-480178.

PR 01-JUL-1996; ZA-005572. 

PA (CALJ ) CALGENE INC. 

PI Mcbride K, Pear JR, Perez-Grau L, Stalker DM; 

DR WPI; 97-052325/05. 

DR P-PSDB; W21899. 

PT DNA construct contg. gene of interest controlled by cotton fibre 

PT transcriptional factor - used to produce altered phenotype cotton 

PT fibre cells expressing genes affecting pigmentation 

PS Claim 22; Fig 2A-J; 95pp; English.

CC The present sequence is a 4-4 cotton fibre expression cassette (version 

CC I) from promoter construct pCGN5606. The lambda genomic phage clone used 

CC to form this construct was designated 4-4(6). DNA constructs containing 

CC cotton fibre-specific transcriptional factor promoters are useful to 

CC produce cotton fibre cells with altered phenotype, especially altered 

CC colour. Genes involved in the production of melanin (e.g. tyrosinase 

CC gene and ORF438 encoded protein from Streptomyces antibioticus) and 

CC indigo (mono-oxygenase genes possibly in conjunction with a 

CC tryptophanase gene) are of interest. The promoters of the invention are 

CC reliable and permit expression of a protein selectively in cotton fibre 

CC to affect qualities such as fibre strength, length, colour and dyability 

CC as required. The construct and methods can also be used for the 

CC introduction of other advantageous genes into a cotton plant, e.g. a 

CC plant hormone. In particular, fibres from a plant producing coloured 

CC fibres may be used to produce yarns and/or fabrics that do not require 

CC dyeing. 

SQ Sequence 5547 BP; 1889 A; 808 C; 822 G; 2028 T; 
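
The SQ line states the total length followed by per-base counts, so the counts should add up to the declared length. A small, hedged sketch of that consistency check in Python follows; the helper name parse_sq_line is hypothetical, and the regular expression assumes the "number label;" layout seen in this listing.

```python
import re


def parse_sq_line(line):
    """Split an SQ line into (declared_length, {label: count})."""
    declared = None
    counts = {}
    for number, label in re.findall(r"(\d+)\s+([A-Za-z]+);", line):
        if label.upper() == "BP":
            declared = int(number)      # total sequence length
        else:
            counts[label] = int(number)  # per-base (or 'other') count
    return declared, counts


declared, counts = parse_sq_line(
    "SQ Sequence 5547 BP; 1889 A; 808 C; 822 G; 2028 T;"
)
print(declared, sum(counts.values()))
```

A mismatch between the two numbers is a cheap signal that a digit was misread during extraction, which is useful for a scanned listing like this one.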



Query Match 27.7%; Score 268; DB 1; Length 5547;

Best Local Similarity 86.7%; Pred. No. 3.1e-58; 

Matches 312; Conservative 0; Mismatches 35; Indels 13; Gaps 1; 
[Alignment continuation, Qy 527-967 against Db 470-911; match lines and most sequence rows are illegible in the source. Row starts (Qy/Db): 527/470, 587/530, 647/590, 707/650, 767/710, 827/770, 886/830, 946/890. Legible rows:]

Qy 886 tgcatagagattctgaatggttatagtttatgttatatcgtttgttctagtgaaattaat 945

Db 890 TTTGAATGTTGTATGTAATGTT 911



RESULT 7 
T43360 

ID T43360 standard; DNA; 3974 BP. 

AC T43360; 

DT 11-MAR-1997 (first entry)

DE Cotton FbLate2-82A gene and promoter. 

KW FbLate; promoter; fibre; transgenic plant; cotton; ds. 

OS Gossypium hirsutum var. Sea Island. 

FH Key Location/Qualifiers 

FT promoter 1..2315

FT /*tag= a

FT /note= "the FbLate promoter located between

FT bases 1 and 2315 is preferred for use in

FT constructs of the invention"

FT CDS 2315..3379

FT /*tag= b

FT /product= unidentified protein

PN WO9639021-A1.

PD 12-DEC-1996. 

PF 06-JUN-1996; U09449. 

PR 06-JUN-1995; US-467504. 

PA (MONS ) MONSANTO CO.

PI John ME; 

DR WPI; 97-042726/04. 

PT Plant fibre-specific, developmentally regulated FbLate promoter -

PT useful for producing transgenic plants, esp. cotton, with altered 

PT fibre properties 

PS Claim 4; Page 57-59; 79pp; English. 

CC A 3974 bp region (T43360) of clone pSKSIFbLate2-28A includes the

CC fibre-specific FbLate promoter that is active during late fibre 

CC development, plus a coding sequence (FbLate-82A) for an unknown 

CC protein. The clone was obtd. from a fibre genomic library using a 

CC cDNA clone (see also T43362) that corresponds to RNA prevalent in 

CC late fibre development, and insertion of an isolated clone into 

CC Bluescript SK+ vector. The FbLate promoter can be used for tissue- 

CC and developmental-specific expression of fibre and non-fibre 

CC proteins (e.g. polyhydroxybutyrate biosynthetic enzymes) in 

CC transgenic plants, esp. to alter the fibre characteristics of 

CC cotton.

SQ Sequence 3974 BP; 1523 A; 603 C; 597 G; 1251 T;



Query Match 53.1%; Score 513.4; DB 1; Length 3974;

Best Local Similarity 85.9%; Pred. No. 7.1e-120;

Matches 587; Conservative 0; Mismatches 81; Indels 15; Gaps 1;

[Alignment match lines and database rows are illegible in the source. Row starts (Qy/Db): 8/2304, 68/2364, 128/2424, 188/2484, 233/2544, 293/2604, 353/2664, 413/2724, 473/2784, 533/2844, 593/2904, 653/2964. Legible query rows:]

Qy 8 tttggttaaccatggctcataactttcgtcatcctttcttccttttccaacttttactca 67

Qy 188 agtct---------------gaatacaaacagccaaaatatcatgaagagtacccaaaac 232



RESULT 8
T43362

ID T43362 standard; cDNA; 645 BP.
AC T43362;

DT 11-MAR-1997 (first entry)

DE Cotton FbLate 2-82A gene cDNA clone All (FbLate-2).
KW FbLate; promoter; fibre; transgenic plant; cotton; ds.
OS Gossypium hirsutum.
PN WO9639021-A1.

PD 12-DEC-1996.
PF 06-JUN-1996; U09449.
PR 06-JUN-1995; US-467504.
PA (MONS ) MONSANTO CO.
PI John ME;

DR WPI; 97-042726/04.

PT Plant fibre-specific, developmentally regulated FbLate promoter -
PT useful for producing transgenic plants, esp. cotton, with altered
PT fibre properties
RESULT 4
T62624 

ID T62624 standard; cDNA to mRNA; 1283 BP.

AC T62624; 

DT 14-HM-1997 (first entry) 

DE Cotton fibre specific cDNA clone CKFB15-E9.

KW cotton; fibre-specific; strength; transgenic plant; anthesis;

KW developmentally regulated; E6; H6; antisense; sense; ss.

OS Gossypium hirsutum strain Coker 312.

PN US5597718-A.

PD 28-JAN-1997.

PF 04-OCT-1988; 253243.

PR 04-OCT-1988; US-253243.

PR 21-NOV-1990; US-617239. 

PR 18-OCT-1993; US-138814. 

PR 20-SEP-1995; US-530797. 

PA (CETU ) AGRACETUS.

PI Brill WJ, John ME, Umbeck PF;

DR WPI; 97-108326/10. 

PT Prodn. of transgenic cotton plants - by transformation with the H6

PT coding sequence or E6 anti-sense sequence, produces fibre of altered 

PT strength 

PS Example 4; Column 53-54; 33pp; English. 

CC T62609-24 are cotton fibre-specific cDNA clones which can be used to 

CC identify genomic clones. This clone, CKFB15-E9, is expressed in fibre

CC cells, but is also expressed at low levels in petal. (CK =

CC Coker; FB = Fibre; 10, 15 or 23 = age in days of fibre cells; Al and the

CC last character and number stand for clone identity). The fibre-specific

CC genes were identified by differential cDNA library screenings. Coding

CC sequences from these isolated genes are used in sense or antisense 

CC orientation to alter the fibre characteristics, e.g. strength, of 

CC transgenic fibre-producing plants. 

SQ Sequence 1283 BP; 509 A; 233 C; 251 G; 290 T; 



Query Match 75.5%; Score 730; DB 1; Length 1283; 

Best Local Similarity 82.5%; Pred. No. 2.2e-174;

Matches (count illegible); Conservative 0; Mismatches 30; Indels 169; Gaps 2;

[Alignment match lines and most sequence rows up to the row starting at Qy 421 are illegible in the source. Row starts (Qy/Db): 1/28, 61/88, 121/148, 181/208, 241/268, 301/328, 361/388, 421/448. Legible row:]

Db 88 TTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGCTCGACAT 147

Qy 434 ------------------------------------------------------------ 433

Db 508 AAGGAGCACGAAGAATACGAGAAAGAAAAACCCGAGTTCCCCAAATGGGAAAAGCCTAAA 567 



Qy 434 ------------------------------------------------------------ 433

Db 568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627 

Qy 434 -agaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaa 492 

III llll lllllllllllllllllllll IIIIIIIIIIMIIIIIMI llllll 
Db 628 AAGAGTAAGGAACATAAAGATGAAGAGTGCCACGAGTCACACGAATCGAAAGATCACGAA 687 

Qy 493 gagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaa 552 

IIIIIIIIIIMIIIIIIIII Mill llllllllllllllllllll MINIUM 
Db 688 GAGTACGAGAAAGAAAAACCCAATTTCTTCAAATGGGAAAAGCCTAAAGAGCACGAGAAA 747 

Qy 553 cataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaa 612 

IIIMIIIIIIIIIIM MINIM 1 1 M 1 1 1 1 1 1 1 1 1! M 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 
Db 748 CATAAAGCCGAATATCCAAAAATACCCGAGTGCAAGGAAAAACAAGATGAGGATAAGGAA 807 

Qy 613 cataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcata 672 

IIIIIIIIIIIIIIIIIMIIIIinilllllllllllMIIIIIIIIIIIIIIIII I 

Db 808 GATAAACATGAGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCTGAGAAAGGCAGA 867 

Qy 673 gtaccctgagtgggttaaaatgcctgaatggccgaagtccatgtttactcagtctggctc 732 
IIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIMIIIIIIIIIIMIIMIMIIII ' 
Db 868 GTACCCTGAGTGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTACTCAGTCTGGCTC 927 

Qy 733 gagcactaagccttaagccatatgacactggtgcatgtgccatcatcatgcagtaatttc 792 

Mill IIIIMIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIMIIIIUIIIIIIt 
Db 928 GAGCATTAAGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTC 987 

Qy 793 atgggatattgtaattatattgttaataaaaaagatggtgagtgggaaatgtgtgtgtgc 852 

IIIIIIMI MMIIIMIIIIMIIIIIIIIIIIIIIIIIIIIIIIMIMMIIIM 

Db 988 ATGGGATATCGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGC 1047 
Qy 853 attcatccatg-agcaatgctgaatctctttgcatgcatagagattctgaatggttatag 911 

iiiiiniiii iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiim 

Db 1048 ATTCATCCATGTAGCAATGCTGAATCTCTTTGCATGCATAGAGATTCTGAATGGTTATAG 1107

Qy 912 tttatgttatatcgtttgttctagtgaaattaattttgaatgttgtatgtaatgtt 967 

IIIIIIMIIIIMIIIMIIIIMIIIMIIIMIIIIIIIMIIII IIIIIM 

Db 1108 TTTATGTTATATCGTTTGTTCTAGTGAAATTAATTTTGAATGTTGTATCTAATGTT 1163 



RESULT 5 
T70055 

ID T70055 standard; cDNA; 1283 BP. 

AC T70055; 

DT 20-AUG-1997 (first entry) 

DE Cotton fibre specific cDNA clone E9.

KW cotton; E6; fibre; promoter; transgenic plant; truncated;

KW heterologous gene expression; ds.

OS Gossypium hirsutum strain Coker 312.

PN US5620882-A.

PD 15-APR-1997.

PF 04-OCT-1988; 253243.

PR 04-OCT-1988; US-253243.

PR 21-NOV-1990; US-617239.

PR 18-MAY-1992; US-885970.

PR 19-OCT-1994; US-298829. 

PA (CETU ) AGRACETUS INC. 

PI John M; 

DR WPI; 97-235185/21. 

PT DNA constructs contg. truncated promoter sequence - for

PT fibre-specific gene expression in cotton plants

PS Example 3; Column 45-48; 48pp; English.

CC T70040-57 are cotton fibre-specific cDNA clones which can be used to 

CC obtain genomic clones containing fibre-specific promoters. Claimed DNA 

CC constructs comprise a truncated promoter sequence (from one of T70031-38) 

CC that promotes preferential gene expression in plant fibre cells, a 

CC protein coding sequence not naturally associated with the promoter 

CC sequence and a 3' termination sequence. The DNA constructs are useful for
121 TTATTCCACACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATACGAAAAG 180

181 cacgaagagtctgaatacaaacagccaaaatatcatgaagagtacccaaaacatgagaag 240 

181 CACGAAGAGTCTGAATACAAACAGCCAAAATATCATGAAGAGTACCCAAAACATGAGAAG 240 

241 cctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgag 300 

241 CCTGAAATGTACAAGGAGGAAAAACAAAAACCCTGCAAACATCATGAAGAGTACCACGAG 300

301 tcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgatttccccaaatgg 360 

301 TCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCCGATTTCCCCAAATGG 360 

361 gaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaag 420 

361 GAAAAGCCTAAAGAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAG 420 

421 gacaaacaagatgagaataagaaacataaagatgaagagtgccaggagtcacacgaatcg 480 

421 GACAAACAAGATGAGAATAAGAAACATAAAGATGAAGAGTGCCAGGAGTCACACGAATCG 480 

481 aaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaa 540 

481 AAAGAGCACGAAGAGTACGAGAAAGAAAAACCCGATTTCCCCAAATGGGAAAAGCCTAAA 540 

541 gggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagat 600 

541 GGGCACGAGAAACATAAAGCCGAATATCCGAAAATACCTGAGTGCAAGGAAAAACTAGAT 600 

601 gaggataaggaacataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacct 660 

601 GAGGATAAGGAACATAAACATGAGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCT 660 

661 gagaaaggcatagtaccctgagtgggttaaaatgcctgaatggccgaagtccatgtttac 720 

661 GAGAAAGGCATAGTACCCTGAGTGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTAC 720 

721 tcagtctggctcgagcactaagccttaagccatatgacactggtgcatgtgccatcatca 780 

721 TCAGTCTGGCTCGAGCACTAAGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCA 780 

781 tgcagtaatttcatgggatattgtaattatattgttaataaaaaagatggtgagtgggaa 840 

781 TGCAGTAATTTCATGGGATATTGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAA 840 

841 atgtgtgtgtgcattcatccatgagcaatgctgaatctctttgcatgcatagagattctg 900 

841 ATGTGTGTGTGCATTCATCCATGAGCAATGCTGAATCTCTTTGCATGCATAGAGATTCTG 900 

901 aatggttatagtttatgttatatcgtttgttctagtgaaattaattttgaatgttgtatg 960 

901 AATGGTTATAGTTTATGTTATATCGTTTGTTCTAGTGAAATTAATTTTGAATGTTGTATG 960 
961 taatgtt 967 
961 TAATGTT 967 



RESULT 2 
T13048 

ID T13048 standard; cDNA; 1283 BP. 

AC T13048; 

DT 27-MAY-1996 (first entry) 

DE Cotton fibre-specific cDNA clone E9.

KW Cotton; fibre; promoter; transgenic plant; crop improvement;

OS Gossypium hirsutum strain Coker 312.

PN US5495070-A.

PD 27-FEB-1996.

PF 04-OCT-1988; 253243.

PR 04-OCT-1988; US-253243.

PR 21-NOV-1990; US-617239.

PR 18-MAY-1992; US-885970.

PA (CETU ) AGRACETUS INC.

PI John M; 

DR WPI; 96-139095/14. 

PT New isolated fibre-specific promoters - used for introducing 

PT altered fibre-specific characteristics into plants, partic. cotton. 

PS Example 3; Column 45-46; 48pp; English. 

CC Cotton cDNA clone E9 (T13048) was isolated from a cDNA library of 

CC cotton var. Coker 312 15-day-old boll cells using a subtractive 

CC hybridization procedure. The clone hybridises strongly to fiber 

CC RNA and weakly to petal RNA. E9 and other fibre-specific cDNA clones

CC (see T13033-47 and T13049-T13050) were used to screen cotton genomic

CC libraries, leading to the isolation of genomic clones (see T13025-32

CC and T13052-53) contg. sequences capable of promoting gene expression

CC in fibre cells. 

SQ Sequence 1283 BP; 509 A; 233 C; 251 G; 290 T;

Query Match 75.5%; Score 730; DB 1; Length 1283;

Best Local Similarity 82.5%; Pred. No. 2.2e-174;

Matches 937; Conservative 0; Mismatches 30; Indels 169; Gaps 

Qy 1 ctttctatttggttaaccatggctcataactttcgtcatcctttcttccttttccaactt 60 

Db 28 CTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTCTTCCTTTTCCAACTT 87 

Qy 61 ttactcattactgtctcactaatgatcggtagccacaccgtctcgtcagcggctcgacat 120 

Db 88 TTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGCTCGACAT 147 

Qy 121 ttattccacacacaaacaacctcatcagagctgccacaattggcttcaaaatacgaaaag 180 

Db 148 TTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATACGAAAAG 207 

Qy 181 cacgaagagtctgaatacaaacagccaaaatatcatgaagagtacccaaaacatgagaag 240 

Db 208 CACAAAGAGTCTGAATACAAACAACCAAAATATCACGAAAAGTACCCAAAACATGAGAAG 267 

Qy 241 cctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgag 300 

Db 268 CCTAAAATGCACAAGGAGGAAAAACAAAAACCCTGCAAACATCATGAAGAGTACCACGAG 327 

Qy 301 tcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgatttccccaaatgg 360 

Db 328 TCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCCGATTTCCCCAAATGG 387 

Qy 361 gaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaag 420 

Db 388 GAAAAGCCTAAAGAGCACAAGAAACACGAAGTTGAATATCCGAAAATACCCGAGTACAAG 447 

Qy 421 gacaaacaagatg----------------------------------------------- 433

iiiimimii 

Db 448 GACAAACAAGATGAGGATAAGGAACATAAAAATGAAGAGTACCATGAATCACGCGAATCG 507 

Qy 434 ------------------------------------------------------------ 433

Db 508 AAGGAGCACGAAGAATACGAGAAAGAAAAACCCGAGTTCCCCAAATGGGAAAAGCCTAAA 567 

Qy 434 ------------------------------------------------------------ 433

Db 568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627 
Qy 434 -agaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaa 492 
Db 628 AAGAGTAAGGAACATAAAGATGAAGAGTGCCACGAGTCACACGAATCGAAAGATCACGAA 687 
Qy 493 gagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaa 552 
Db 688 GAGTACGAGAAAGAAAAACCCAATTTCTTCAAATGGGAAAAGCCTAAAGAGCACGAGAAA 747 
Qy 553 cataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaa 612 
Db 748 CATAAAGCCGAATATCCAAAAATACCCGAGTGCAAGGAAAAACAAGATGAGGATAAGGAA 807 
mi mi in i ii i inn inn m i i i

Db 3514 AAAAAAAAAANAAAAAAAAAAAAAAMAAAAAAAAAANAAAAAAANAAAAAAAAAAANAA 3455 

Qy 288 agagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccga 347 

I I I II I II II I II I I III lllll I 

Db 3454 AMNAAAAAAAAAAAAAANAAAAAAAAAANNAAAAAAAAAAANANNAAAAAAAAAAAAAA 3395 

Qy 348 tttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaat 407 

II I MM I I II I II llll 

Db 3394 AAAAAAAANAAAAANNNAANNNAAAAAAAAAAAAAANNNNAANAMANANAAAAAAAAAA 3335 

Qy 408 acccgagtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccagga 467 

I I II I II II I I II Mill I I I III II 

Db 3334 AAAAAAMNAAAAAAMAAAAAAAMAAAAAAAAAAAAAAANAANAAAAAAANAAAAAAA 3275 

Qy 468 gtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatg 527 

I I II III I I II I I I III lllll I III 
Db 3274 AAAAAAAAAANAAAAMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANNNNNAAAAAM 3215 

Qy 528 ggaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaa 587 

llll II I III I III II I llll I I II 
Db 3214 AAAAAAMANNAAAAAAAAANAAAAAAAAAAAAAAAAAAAAMAAAAAMAMAAAAAAA 3155 

Qy 588 ggaaaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaaga 647 

lllll I I I II II III I III I III I II I 

Db 3154 AAAAAAANANNAAAAAAAAAAAAAAMAAAAAAMNAAAAAAAAAAAAAAAAANAAAAAA 3095 

Qy 648 ggagaagaaacctgagaaaggcatagtaccctgagtgggttaaaatgcctgaatggccga 707

I II III I I II I I I I II II 

Db 3094 AAMAAAAAMCAAAAAANNNAAANNNNAN1MANNNNNNNNAGNNNNATTTAT NACCCC 3035 

Qy 708 agtccatgtttactcagtctggctcgagcactaagccttaagccatat 755 

I I I I III llll I II III llll 
Db 3034 CCCCTGGTATGATTGCATCTTGCTCRNACCAAAAAAGTTATTAAATAT 2987 



Search completed: September 3, 2000, 02:53:40 
Job time: 31570 sec 
RL DOUBUTSUYOU SEIBUTSUGAKUTEKI SEIZAI KYOKAI, KITASATO INST: THE,
XX 

CC OS None 

CC OC Artificial sequences. 

CC PN JP 1995284392-A/2 

CC PD 31-OCT-1995 

CC PF 19-APR-1994 JP 1994080643 

CC PI DOI HIROHITO, NAGAKUCHI YOSHIO, TANAKA YOSHIO, PUJISAKI TUJIRO 

CC PC C12N15/09,A61K39/015 / C12P21/02;

CC CC strandedness: Double; 

CC CC topology: Linear; 



CC FH Key Location/Qualifiers

CC FH

CC FT source 1..3399

CC FT /organism="Artificial sequences"

CC FT CDS 1..3399

CC FT /product="fusion protein of maltose-binding

CC FT protein and an

CC FT immunogenicity protein"

CC FT misc_feature 1..1149

CC FT /note="maltose-binding protein"

CC FT misc_feature 1150..1174

CC FT /note="EcoRI adaptor"

CC FT misc_feature 1174..3195

CC FT /note="immunogenicity protein"

CC FT misc_feature 3194..3218

CC FT /note="EcoRI adaptor"

CC FT misc_feature 3219..3399

CC FT /note="sequence derived from pMAL-c vector"

XX

FH Key Location/Qualifiers

FH

FT source 1..3399

FT /db_xref="taxon:32644"

FT /organism="unidentified"


Sequence 3399 BP; 1577 A; 509 C; 797 G; 516 T; 0 other;



Query Match 9.1%; Score 88.4; DB 23; Length 3399; 

Best Local Similarity 48.8%; Pred. No. 1.1e-08;

Matches 239; Conservative 0; Mismatches 251; Indels 0; Gaps 

Qy 168 aaaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccc 227 

I! I Mil I I III I III II I I II I I Mill I 
Db 2559 AACACATGAAGAAGAAGAAAAAGTAACATATGAAGAAGAAGAAGAAGAAGAAGAAAAAGT 2618 

Qy 228 aaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatga 287 

ii nun i mi i i mi ii ii m i nil i n 

Db 2619 AACACATGAAGAAGAAGAAAAAGTAACACATGAAGAAGAAGAAAAAGTAACACATGAAGA 2678 

Qy 288 agagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccga 347 

III I I I II III I II I Mill I II Mill II II 
Db 2679 AGAAGAAAAAGTAATACATGAAGAAGAAGAAAAAGAAGAAGATGAGGAAGAAGAAGAAGA 2738 

Qy 348 tttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaat 407 

ii ii i mi i ii i i mi iii i ii 

Db 2739 AGAAGAAGAAGAAGAGGAAGAAGAAGAAGAAGAAGATGAGGAAGAGGAAGAAGAAGAAGA 2798 

Qy 408 acccgagtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccagga 467 

I II I I II II MM II I II I I MM Mill I I 
Db 2799 AGAAGATGAGGAAGAGGAAGAAGAAGAAGAAAATGAGGAAGAAGAAGAAGAAGAAAATAA 2858 

Qy 468 gtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatg 527 

I I I III Mil I MMII I II lllll I I II -J 
Db 2859 GGAAGAAGAAGAAGAAGAAAAAGAAGAGCATGAAGAAGAAGTAACACATGAAGAAGAAGA 2918 

Qy 528 ggaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaa 587 

lllll II I II II I MM Ml II I ' I I 
Db 2919 AGAAAAAGTAACACATGAAGAAGAAGAAAAAGTAACACATGAAGAAGAAGAAAATGTAAC 2978 

Qy 588 ggaaaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaaga 647 

, i ii inn ii ii ii i ii i i ii Minn 



Db 2979 ATATGAAGAAGAAGAAGAAAAAGTAACACATGAAGAAGAAGAAAAAGTAACACATGAAGA 3038 

Qy 648 ggagaagaaa 657 

II I III 
Db 3039 AGAAGAAAAA 3048 



RESULT 15 
AC013349/C 
LOCUS AC013349 129404 bp DNA HTG 06-FEB-2000
DEFINITION Homo sapiens Clone RP11-22K1, LOW-PASS SEQUENCE SAMPLING.
ACCESSION AC013349
VERSION AC013349.2 GI:6910730
KEYWORDS HTG; HTGS_PHASE0.
SOURCE human.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 129404)
AUTHORS Birren,B., Linton,L., Nusbaum,C. and Lander,E.
TITLE Homo sapiens, clone RP11-22K1
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 129404)
AUTHORS Birren,B., Linton,L., Nusbaum,C., Lander,E., Allen,N., Anderson,M.,
Baldwin,J., Barna,N., Beckerly,R., Boguslavkiy,L., Boukhgalter,B.,
Brown,A., Castle,A., Colangelo,M., Collins,S., Collymore,A.,
Cooke,P., DeArellano,K., Dewar,K., Domino,M., Donelan,L., Doyle,M.,
Ferreira,P., FitzHugh,W., Forrest,C., Funke,R., Gage,D.,
Galagan,J., Gardyna,S., Grant,G., Hagos,B., Heaford,A., Horton,L.,
Howland,J.C., Johnson,R., Jones,C., Kann,L., Karatas,A., Klein,J.,
Lehoczky,J., Lieu,C., Locke,K., Macdonald,P., Marquis,N.,
McEwan,P., McGurk,A., McKernan,K., McLaughlin,J., Meldrim,J.,
Morrow,J., Naylor,J., Norman,C.H., O'Connor,!,, O'Donnell,P.,
Peterson,K., Pollara,V., Riley,R., Roy,A., Santos,R., Severy,P.,
Stange-Thomann,N., Stojanovic,N., Subramanian,A., Talamas,J.,
Tesfaye,S., Tirrell,A., Vassiliev,H., Vo,A., Wheeler,J., Wu,X.,
Wynan,D., Ye,W.J., Zimmer,A. and Zody,M.
TITLE Direct Submission
JOURNAL Submitted (06-NOV-1999) Whitehead Institute/MIT Center for Genome
Research, 320 Charles Street, Cambridge, MA 02141, USA

COMMENT On Feb 6, 2000 this sequence version replaced gi:6272406.

All repeats were identified using RepeatMasker:

Smit, A.F.A. & Green, P. (1996-1997)

http://ftp.genome.washington.edu/RM/RepeatMasker.html

Genome Center

Center: Whitehead Institute/MIT Center for Genome Research

Center code: WIBR

Web site: http://www-seq.wi.mit.edu

Contact: sequence_submissions@genome.wi.mit.edu

Project Information

Center project name: L4134
Center clone name: 22_K_1

NOTE: This record contains 151 individual
sequencing reads that have not been assembled into
contigs. Runs of N are used to separate the reads
and the order in which they appear is completely
arbitrary. Low-pass sequence sampling is useful for
identifying clones that may be gene-rich and allows
overlap relationships among clones to be deduced.
However, it should not be assumed that this clone
will be sequenced to completion. In the event that
the record is updated, the accession number will
be preserved.

1 930: contig of 930 bp in length
gap of unknown length
931 1855: contig of 925 bp in length

gap of unknown length
1856 2759: contig of 904 bp in length

gap of unknown length
2760 3706: contig of 947 bp in length

gap of unknown length
3707 4566: contig of 860 bp in length
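
Because the record separates unassembled reads with runs of N, the individual pieces can be recovered by splitting the sequence on those runs. The Python below is a minimal sketch under that assumption; split_on_n_runs is a hypothetical helper, the toy sequence is invented for illustration, and a genuine ambiguous N inside a read would also cause a split.

```python
import re


def split_on_n_runs(sequence):
    """Return (start, end, fragment) tuples, 1-based inclusive coordinates.

    Any run of N (upper or lower case) is treated as a separator between
    reads; this is an assumption that follows the NOTE in the record.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in re.finditer(r"[^Nn]+", sequence)
    ]


# Toy input, not taken from the record.
pieces = split_on_n_runs("ACGTNNNNGGCCNNA")
print(pieces)
```

The coordinates are reported in the same 1-based inclusive convention as the contig table in the record, so the output can be compared against it directly.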
/db_xref="GI:1255419"

/translation="MLCQVCGAEGPEPHFGGISCRACAAFFRRYVHSRKLDISCTCKH
RLATSHPCRHCRMLKCMATGMVKCKVQGSREKNKITTSSLPGHISSISLLSARIVPRD 
CSNISCTVSKWTKVEKMRKDLYGEKICEINFTQFSSFVKRDTHLLWDLGEKIFTDVKL 
LSEADKHSILCNFFPRWMMLDSAVAICPDYEESKAYIKSKDYTDMLLHFTGSSMPKEK 
RLKDHEILKIFKPYWDFHYYETAVPIHFKKLDKIEYMAIFLLLLFDDAYTNISEEGVK 
LCQNVRKWQRELKGYQTDSNCDEMRFVETMDTLLLLEKAEEKIQEEVLICGFNNVTL 
HEDFRTIFQVKKL" 
gene complement(15326..17135)

/gene="C33G8.11"

CDS complement(join(15326..15621,15852..16191,16239..16466,

16545..16636,16682..16747,16994..17135))
/gene="C33G8.11"

/note="similar to steroid/thyroid/retinoic nuclear hormone

receptors"

/codon_start=1

/evidence=not_experimental

/protein_id="AAC25855.1"

/db_xref="GI:3294498"

/translation="MVNCMVCDASSAQYHFGAIACRACAAFFRRYVNSKKLTILCKCL
SKKESQYPCRLCRMKKCKAVGMEASKVQGPRDVNNPFKIKMIEGSSSPESTLSSIEYG 
LQPRDSELIKLIIKNYKNLEMTREAIYNMTPSTTGTMQVNLYELSLEVKTDSKLML 
CEDTFPEFNQLCKMDKRILFNNFYSKWSILEVTMLATKYNDSKNFYSPSGAMCTSINE 
FYVNTVRDNSAISQEDI IRYLATRWIFNLWLRVFEPMYHFEQVIDPFLALKLNDM 
ENMALFGIIFWDGSYTNISDELSELCHSMRKIICRELSAHFNETCTTSSRFFETLDTL 
NnEVSLAMTTRSQCKLLLQKAERKCQEEIALCGFYNFEVDDDMKNMIMWEKY" 
gene 18885..20672

/gene="nhr-42"

CDS join(18885..18967,19407..19725,19771..19896,19988..20051,

20100..20211,20260..20375,20422..20672)
/gene="nhr-42"

/note="C33G8.6; coded for by C. elegans cDNA yk482f11.5;
coded for by C. elegans cDNA yk482f11.3"
/codon_start=1

/product="similar to steroid/thyroid/retinoic nuclear
hormone receptors"
/protein_id="AAC25857.1"
/db_xref="GI:3294500"

/translation="MTRQTTSQTCLICGDSADSLHFGALSCRACAAFFRRKVAGRRNI
FRRCDRQCKVDTGMRKLCASCRYDKCLKVGMRESAVLSRLAKRNQNYKKSIVGSPDAY 
EPSTSTSDSVLENIflSAYHKLEETRKRVFNISETHVSQCCNYKRMNDVFFEDIKLVME 

Query Match 9.2%; Score 89; DB 34; Length 39103;

Best Local Similarity 48.7%; Pred. No. 6.2e-09; 

Matches 242; Conservative 0; Mismatches 255; Indels 0; Gaps 0; 

Qy 169 aaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccca 228 

I! I III I I II III II I III I III I I II III I 
Db 31431 AAGGAAGAAGATGATGAGGAGGAAGAGCAAAAAGATAAGAAAAAAAAAGACGAGAAGAAT 31490 

Qy 229 aaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaa 288 

I Mil I III I III I II II- I III I II Mil 
Db 31491 GACGATGATGATGAAGAAGACAAGAAGAAAGACAAGAAGAAAAAGAAGGATGATAATGAT 31550 

Qy 289 gagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgat 348 

III I II II III II I II II I I I II I Mil 
Db 31551 GAGGAGGAAAAAGAGAAAGATAAGAAAAAGAAGGACAAGAAAAAGGACGATGATGACGAT 31610 

Qy 349 ttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaata 408 

II II III IMI I I Ml I I I II I I 
Db 31611 GAAGATGAAAATGATAAGAAAAAGGATAAGAAAAAAAAGAAGGATGACAAGGATGATGAT 31670

Qy 409 cccgagtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccaggag 468 

I I I MM I Ml II MUM I II Mill II Mil 
Db 31671 GAGAATGAGGATGACAAAAAGAAGGACAAAAAGAAAAAGAAGGATGATAAGGAAAAGGAT 31730 

Qy 469 tcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatgg 528 

I II I Mill I II II I lllllll III I II 
Db 31731 GATGATGATGAGGAAGAGAAGGATAAGAAAAAGAAAGACAAAAAGAAGAATGACGACGAC 31790 

Qy 529 gaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaag 588 

III I MM I I II II Ml II I MM II 

Db 31791 GAAGAAGATAAAAAGAAGGACAAGAAAAAGAAAAAGGACGATGATGACGATGAGGATGAG 31850



Qy 589 gaaaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaagag 648 

IMI I IIIIIMI II III I I! I MM I II II 
Db 31851 GATAATAAGAAAAAGGATAAGAAAAAGAAGAAGGATGATAAGGATGATGAAGATGAGGAA 31910 

Qy 649 gagaagaaacctgagaa 665 

II Ml IMI 
Db 31911 AAGGAGAAGGACAAGAA 31927 



RESULT 12 
AC006884/C 
LOCUS AC006884 193188 bp DNA HTG 26-FEB-1999
DEFINITION Caenorhabditis elegans clone Y57E12, *** SEQUENCING IN PROGRESS
***, 4 unordered pieces.
ACCESSION AC006884
VERSION AC006884.2 GI:4309911
KEYWORDS HTG; HTGS_PHASE1.
SOURCE Caenorhabditis elegans.
ORGANISM Caenorhabditis elegans
Eukaryota; Metazoa; Nematoda; Secernentea; Rhabditia; Rhabditida;
Rhabditina; Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis.
REFERENCE 1 (bases 1 to 193188)
AUTHORS Waterston,R.H.
TITLE The sequence of Caenorhabditis elegans clone
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 193188)
AUTHORS Waterston,R.H.
TITLE Direct Submission
JOURNAL Submitted (24-FEB-1999) Genome Sequencing Center, Washington
University School of Medicine, 4444 Forest Park Parkway, St. Louis,
MO 63108, USA
COMMENT On Mar 1, 1999 this sequence version replaced gi:4263464.

* NOTE: This is a 'working draft' sequence. It currently

* consists of 4 contigs. The true order of the pieces

* is not known and their order in this sequence record is

* arbitrary. Gaps between the contigs are represented as

* runs of N, but the exact sizes of the gaps are unknown.

* This record will be updated with the finished sequence

* as soon as it is available and the accession number will

* be preserved.

* 1 2784: contig of 2784 bp in length

* 2785 2799: gap of unknown length

* 2800 37804: contig of 35005 bp in length

* 37805 37819: gap of unknown length

* 37820 74165: contig of 36346 bp in length

* 74166 74180: gap of unknown length

* 74181 193188: contig of 119008 bp in length.
FEATURES Location/Qualifiers
source 1..193188

/organism="Caenorhabditis elegans"
/db_xref="taxon:6239"
/clone="Y57E12"
BASE COUNT 63610 a 33574 c 33884 g 62075 t 45 others



Query Match 9.2*; Score 89; DB 41; Length 193188; 

Best Local Similarity 48.7%; Pred, No, 5e-09; 

Matches 242; Conservative 0; Mismatches 255; Indels 0; Gaps 0; 

Qy 169 aaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccca 228 

II MM I I II III II I Ml I III I I II III I 
Db 125873 AAGGAAGAAGATGATGAGGAGGAAGAGCAAAAAGATAAGAAAAAAAAAGACGAGAAGAAT 125814 

Qy 229 aaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaa 288 

I IMI I III I III Ml II I III I II Ml 
Db 125813 GACGATGATGATGAAGAAGACAAGAAGAAAGACAAGAAGAAAAAGAAGGATGATAATGAT 125754 

Qy 289 gagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgat 348 

Mill II III II III Ml I I II I Ml 
Db 125753 GAGGAGGAAAAAGAGAAAGATAAGAAAAAGAAGGACAAGAAAAAGGACGATGATGACGAT 125694 



37805 
37820 
74166 
74181 



FEATURES 
source 
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VCNSPEKLQETASWLLAQOLGDGSFHDPCPVIHRAMQGGLVGSDETVALTAFWIALH 

HGLDVFQDDDAKQLKNRVEASITKANSPLGQRASAGLLGAHAAAITAYALTLTKASED 

LRNVAHNSLMAMAEETGEHLYWGLVLGSQDKWLRPTAPRSPTEPVPQAPALWIETTA 

YALLHLLLREGKGKMADRAASWLTHQGSFHGAPRSTQDTWTLDALSAYWIASHTTEE 

KALNVTLSSMGRNGLKTHGLHLNNHQVKGLEEELKFSLGSTISVKVEGNSKGTLKILR 

TYNVLDMKNTTCQDLQIEVKWGAVEYAWDANEDTEDYYDMPAADDPSVPLQPVTPLQ 

LFEGRRSRRRREAPKWEEQESRVQYTVCIWRNGKLGLSGMAIADITLLSGFHALRAD 

LERLTSLSDRYVSHFETDGPHVLLYFDSVPTTRECVGFGASQEVWGLVQPSSAVLYD 

YYSPDHKCSVFYAAPTKSQLLATLCSGDVCQCAEGKCPRLLRSLERRVEDKDGYRKRF 

TCYYPRVEYGFTVKVLREDGRAAFRLFESKITQVLHFRKDTMASIGQTRNFLSRASCR 

LRLEPNKEYLIMGMDGETSDNKGDPQYLLDSNTWIEEMPSEOMCKSTRHRAACFQLKD 

FLMEFSSRGCQV" 
repeat_region complement(6866..7037) 
/rpt_family="ORR1A" 
repeat_region 9638..9759 
/rpt_family="PB1D7" 
repeat_region 9678..9759 
/rpt_family="PB1D9" 
repeat_region 11554..11584 
/rpt_family="(CA)n" 
repeat_region 11585..11981 
/rpt_family="MTA" 
repeat_region 11979..12096 
/rpt_family="(CA)n" 
repeat_region 12463..12505 
/rpt_family="(GGA)n" 
repeat_region 12506..12621 
/rpt_family="(GGGA)n" 
repeat_region 12609..12727 
/rpt_family="(GAA)n" 
repeat_region 12735..12782 
/rpt_family="(GGA)n" 
repeat_region 12797..12916 
/rpt_family="(GGA)n" 
repeat_region 12845..12962 
/rpt_family="(GGAA)n" 
repeat_region 12982..13104 
/rpt_family="(GGA)n" 
repeat_region 13141..13259 
/rpt_family="(GGA)n" 
repeat_region 13262..13318 
/rpt_family="(GGA)n" 
repeat_region 13551..13820 
/rpt_family="B4A" 
repeat_region 14019..14137 
/rpt_family="(CA)n" 
repeat_region 15814..16084 
/rpt_family="B4" 
repeat_region 16086..16200 
/rpt_family="B1_F" 
repeat_region 17661..17775 
/rpt_family="PB1D7" 
repeat_region 17816..17902 
/rpt_family="PB1D7" 
repeat_region complement(18595..18693) 
/rpt_family="(CA)n" 
repeat_region complement(20122..20306) 
/rpt_family="B3" 
misc_feature 20304..22376 
/note="this span duplicates the region from 105857-107756; 
it contains a copy of exons 3 and 4 of the Gil gene" 
repeat_region complement(20314..20457) 
/rpt_family="B1_MM" 
repeat_region complement(20461..20663) 
/rpt_family="B4A" 
repeat_region complement(21345..21431) 
/rpt_family="(GGAA)n" 
repeat_region 21913..22022 
/rpt_family="PB1D9" 
repeat_region 22126..22222 
/rpt_family="PB1D10" 
repeat_region complement(22379..22721) 
/rpt_family="ORR1AS" 
repeat_region 22788..22924 
/rpt_family="B3" 
repeat_region 22938..23864 
/rpt_family="RLTR13C" 
repeat_region complement(23896..23934) 
/rpt_family="(GAAAA)n" 
repeat_region 25028..26295 
/rpt_family="MMETN" 
repeat_region 26296..26443 
/rpt_family="B1_M" 
repeat_region 26445..26523 
/rpt_family="POLY_V" 
repeat_region complement(26524..26782) 
/rpt_family="RLTR1" 
repeat_region complement(26827..26953) 
/rpt_family="RLTR1" 



Query Match 9.2%; Score 89.4; DB 12; Length 149886; 

Best Local Similarity 50.1%; Pred. No. 4.3e-09; 

Matches 249; Conservative 0; Mismatches 246; Indels 2; Gaps 1; 

Qy 173 acgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtacccaaaac 232 

I III II I III HIM I II III I II III I III 
Db 12625 AGGAATAGAAGAAGGAAGAAGAATAGGAGAAGGAGGAAGAAGAGGAGGAGGATGAGAAAG 12684 

Qy 233 atgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaagagt 292 

I II II II I I Mill II I I II II I I II III 

Db 12685 AGGAAAAAAAGGAGGAGGAAGAGGAGAAAGAGGAGAAGGAAGAGAAGTTTGAAGAGGAGG 12744 

Qy 293 accacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgatttcc 352 

II II I II I II I I llllll I II II II II . 

Db 12745 AGGAGAAGGAAAAGGAGGTGGAGAATGAGGAAGAGGAGGAAAAGGAGGGGAAGGAAGAGG 12804 

Qy 353 ccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaatacccg 412 

I llllll II III I III II I II I II I III I I 

Db 12805 AGGAGGAAGAAAAGAAAAAGGAGGAGGAGGAATAGGA1GAGGAGGAAGAGAAGAAGGAGG 12864 

Qy 413 agtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccaggagtcac 472 

II I I INI Ml III III I I I II llllll I III . 

Db 12865 AGGAGGAAGAAAAGGAGGAAGAGGGGAAGGAGGAGGAGGAGGAAGAGGAGGAAGAGAATA 12924 

Qy 473 acgaatcg-aaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatggga 530 

I III I Mill I Mil I Mil III I II I III 
Db 12925 AGGAAGAGGAGAAGAGAAGGGAGAAAAGGAGAAGGAAGAGGAGGAGGAAGAGGACAAGGA 12984 

Qy 531 aaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaagga 590 

II III I I II III I II I III II Ml 

Db 12985 GGAGGAGAAGGAGGAGGATGAAGAGGACAAGGAGGAGGAGAAGGAGGAGGATGAAGAGGA 13044 

Qy 591 aaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaagagga 650 

III I II Mil Ml I I III III I II II III Ml 

Db 13045 CAAAGTGGAGGAGGAAGAGGAGAAGTTTGAGGAGGAGGAGAAGGAGGAGAAGGAAAAGGA 13104 

Qy 651 gaagaaacctgagaaag 667 

I II I III II 
Db 13105 GGAGGAGAAGGAGGAGG 13121 
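
The percentages in the summary lines of this report appear to follow from the counts printed beside them: for each result shown here, Best Local Similarity equals Matches divided by the sum of Matches, Conservative, Mismatches and Indels. The sketch below is a minimal Python illustration of that relation. It is inferred from the figures in this listing rather than from the search program's documentation, and the function name `best_local_similarity` is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of aligned columns that are exact matches.

    Assumption (inferred from the report, not documented): the
    denominator counts every aligned column, i.e. matches,
    conservative substitutions, mismatches and indel positions.
    """
    aligned_columns = matches + conservative + mismatches + indels
    if aligned_columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / aligned_columns


# Counts copied from the summary of the alignment above:
# Matches 249; Conservative 0; Mismatches 246; Indels 2.
print(round(best_local_similarity(249, 0, 246, 2), 1))  # prints 50.1
```

The same calculation reproduces the 48.7% reported for the result with 242 matches and 255 mismatches, which is why the denominator is assumed to include indel positions as well as substitutions.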



RESULT 11 
CELC33G8 

LOCUS CELC33G8 39103 bp DNA INV 08-JUL-1998 

DEFINITION Caenorhabditis elegans cosmid C33G8. 

ACCESSION U53154 

VERSION U53154.1 GI:1255414 

KEYWORDS . 

SOURCE Caenorhabditis elegans. 
ORGANISM Caenorhabditis elegans 

Eukaryota; Metazoa; Nematoda; Secernentea; Rhabditia; Rhabditida; 
Rhabditina; Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis. 
1 (bases 1 to 39103) 

Wilson,R., Ainscough,R., Anderson,K., Baynes,C., Berks,M., 
Bonfield,J., Burton,J., Connell,M., Copsey,T., Cooper,J., 
Coulson,A., Craxton,M., Dear,S., Du,Z., Durbin,R., Favello,A., 
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/dbjtref""taxon:9606" 
/clone="84_E_24" 
/clone_lib="Alan Buckler per comm" 
/map="17" 
/chromosome="17" 
repeat_region 446..753 
/rpt_family="AluSc" 
repeat_region 2375..2419 
/rpt_family="(TTTTG)n" 
repeat_region complement(2985..3103) 
/rpt_family="L2" 
repeat_region complement(3146..3352) 
/rpt_family="L2" 
repeat_region complement(3531..3639) 
/rpt_family="L2" 
repeat_region complement(3798..4394) 
/rpt_family="L2" 
repeat_region 5531..5631 
/rpt_family="MER81" 
repeat_region 5648..5674 
/rpt_family="(CAAAAA)n" 
repeat_region 5812..6020 
/rpt_family="L1MC4" 
repeat_region 6180..6224 
/rpt_family="AT_rich" 
repeat_region 6303..6611 
/rpt_family="AluSp" 
repeat_region 7378..7618 
/rpt_family="MIR" 
repeat_region 9593..9623 
/rpt_family="(GGGA)n" 
repeat_region complement(10666..10923) 
/rpt_family="AluSq" 
repeat_region 10958..11088 
/rpt_family="MIR" 
repeat_region 11752..11914 
/rpt_family="L2" 
repeat_region complement(11915..12219) 
/rpt_family="AluSx" 
repeat_region 12220..12347 
/rpt_family="L2" 
repeat_region 13110..13182 
/rpt_family="GA-rich" 
repeat_region 13526..13564 
/rpt_family="(TCC)n" 
repeat_region 13575..13660 
/rpt_family="(TA)n" 
repeat_region complement(14380..14658) 
/rpt_family="AluJo" 
repeat_region 14742..14830 
/rpt_family="(TA)n" 
repeat_region 14860..14967 
/rpt_family="(TA)n" 
repeat_region complement(14977..15122) 
/rpt_family="L2" 
repeat_region complement(15292..15385) 
/rpt_family="L2" 
repeat_region 16672..16781 
/rpt_family="L2" 
repeat_region 16802..16846 
/rpt_family="(TGAA)n" 
repeat_region 17225..17245 
/rpt_family="(A)n" 
repeat_region 18411..18720 
repeat_region complement(20715..20788) 
/rpt_family="MLTU2" 
repeat_region complement(20822..20916) 
/rpt_family="MLTU2" 
repeat_region 20956..21137 
/rpt_family="MIR" 
repeat_region 21830..21872 
/rpt_family="AT_rich" 
repeat_region 21957..22257 
/rpt_family="AluSx" 
repeat_region complement(22438..22538) 
/rpt_family="MLTU" 
repeat_region complement(23375..23667) 
/rpt_family="Alu" 
repeat_region 24553..24858 
/rpt_family="AluY" 
repeat_region complement(25021..25234) 
/rpt_family="MIR" 
repeat_region 26957..27261 
/rpt_family="Alu" 
repeat_region complement(27524..27669) 
/rpt_family="MLT1B" 
repeat_region complement(27670..27964) 
/rpt_family="AluSp" 
repeat_region complement(27965..28210) 
/rpt_family="MLT1B" 
repeat_region complement(28228..28522) 
/rpt_family="AluSg" 
repeat_region complement(28592..28700) 
/rpt_family="MIR" 
repeat_region complement(29353..29537) 
/rpt_family="MIR" 
repeat_region 29802..30095 
/rpt_family="AluSg" 
repeat_region complement(30099..30219) 
/rpt_family="MIR" 
repeat_region 30227..30267 
/rpt_family="(TG)n" 
repeat_region 30276..30321 
/rpt_family="(CA)n" 
repeat_region complement(30322..30611) 
/rpt_family="AluJo" 
repeat_region complement(30956..31151) 
/rpt_family="MER3" 
repeat_region complement(31533..31633) 
/rpt_family="L2" 
repeat_region 31810..31858 
/rpt_family="L2" 
repeat_region 32028..32107 
/rpt_family="L1PA4" 
repeat_region complement(32919..33182) 
/rpt_family="MIR" 
repeat_region 34169..34314 
/rpt_family="CT-rich" 
repeat_region 34774..35007 
/rpt_family="MIR" 
repeat_region 35589..35621 
/rpt_family="AT_rich" 
repeat_region 35763..36142 
/rpt_family="MER41A" 
repeat_region 36143..36452 
/rpt_family="AluSg" 
repeat_region 36453..36636 
/rpt_family="MER41A" 
repeat_region 38011..38071 
/rpt_family="CT-rich" 
repeat_region 38083..38262 
/rpt_family="(TTC)n" 
repeat_region 38235..38319 
/rpt_family="(TTC)n" 
repeat_region 38320..38499 
/rpt_family="(TTC)n" 

Query Match 9.4%; Score 91; DB 39; Length 180385; 
Best Local Similarity 48.9%; Pred. No. 2e-09; 
Matches 244; Conservative 0; Mismatches 255; Indels 0; Gaps 

Qy 169 aaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccca 228 

II 1 llllll 1 II III II 1 II II III 1 1 II II 1 
Db 38579 AAGAAGGAAAAGGAGGAGGAGGAGGAGAAGAAGAAGGAAAAGGAGGAGGAGAAGAAGAAG 38520 
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/db_xref - " taxon : 3 2 644 " 
/organism="unidentified" 

Sequence 1686 BP; 915 A; 129 C; 382 G; 260 T; 0 other; 



Query Match 11.3%; Score 109.6; DB 23; Length 1686; 
Best Local Similarity 51.0%; Pred. No. 5.7e-13; 
Matches 259; Conservative 0; Mismatches 249; Indels 0; Gaps 0; 

Qy 168 aaaatacgaaaagcacgaagagtctgaatacaaacagccaaaatatcatgaagagtaccc 227 

I II I II I II III III I II I I I I I I III I I 

Db 156 AGAAGAACAAGAACAAGAAATCGTAGAAGAACAAGAACAAGATGAAGAAGAACAAGAAGA 215 

Qy 228 aaaacatgagaagcctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatga 287 

II llll I III I II II III II II II II I I II 
Db 216 GGAAGATGAAGAAGAAGAAGAAGAAAAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGA 275 

Qy 288 agagtaccacgagtcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccga 347 

III I I II I III I I II I III! I II lllll III II 
Db 276 AGAAGAAGAAGAAGAAGAAGAAGAGCAAGATGAAGAAGAAGAAGAAGAAGAAGAACAAGA 335 

Qy 348 tttccccaaatgggaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaat 407 

I II III I I II I II II I llll III I II 
Db 336 TGAAGATGAAGAAGAAGAAGAAGATGAAGAAGAAGAAGAAGAAGAAGAACAAAATGAAGA 395 

Qy 408 acccgagtacaaggacaaacaagatgagaataagaaacataaagatgaagagtgccagga 467 

I lllll lllll llll II I II II llll lllll I II 
Db 396 AGAACAAAATGAAGATGAACAAAATGAAGATGAACAAAATGAAGAAGAAGAAGAAGAAGA 455 

Qy 468 gtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatttccccaaatg 527 

I I III llll II II II I II llll II II II 
Db 456 AGAAGAAGAACAACAAGAACAAGATGAAGAAGAACAAGATGAAGAAGAACAAGATGAAGA 515 

Qy 528 ggaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatacctgagtgcaa 587 

III I III II II llll II III I II II II II 
Db 516 AGAAGAAGAAGAAGAACAGGAAGAACAAGATGAAGAACAAGAAGAAGTATATGCTGAAAA 575 

Qy 588 ggaaaaactagatgaggataaggaacataaacatgagttcccaaagcatgaaaaagaaga 647 

lllll llllll II II II I III I II III HUH I I 
Db 576 AGAAAATGAAGATGAAGAAAAAAAAGAAAAAGAAGAAGAACAAGAAGATGAAAAAATATA 635 

Qy 648 ggagaagaaacctgagaaaggcatagta 675 

I I III III llll 
Db 636 TGTTGAAAAAGAAAAAGATGAAGAAGTA 663 



RESULT 7 
I66494/C 

LOCUS I66494 7218 bp DNA PAT 28 

DEFINITION Sequence 14 from patent US 5670367. 

ACCESSION I66494 

VERSION I66494.1 GI:2724471 

KEYWORDS 

SOURCE Unknown. 
ORGANISM Unknown. 

Unclassified. 
REFERENCE 1 (bases 1 to 7218) 

AUTHORS Dorner,F., Scheiflinger,F. and Falkner,F.Gunter. 
TITLE Recombinant fowlpox virus 
JOURNAL Patent: US 5670367-A 14 23-SEP-1997; 
FEATURES Location/Qualifiers 
source 1..7218 

/organism="unknown" 
BASE COUNT 1944 a 1491 c 1486 g 1929 t 368 others 
ORIGIN 



Query Match 10.2%; Score 99; DB 5; Length 7218; 

Best Local Similarity 8.2%; Pred. No. 6.9e-11; 

Matches 36; Conservative 255; Mismatches 150; Indels 0; Gaps 

Qy 263 aacaaaaaccctgcaaacatcatgaagagtaccacgagtcacgcgaatcgaaggagcacg 322 

II III I II I lllll I llll III II III I 

Db 1507 AAAAAACGGCATGTAGGCATCACTGTAATTACCTATCTATGCAAGTAGTTAAAGAGATAG 1448 

Qy 323 aagagtacgataaagaaaaacccgatttccccaaatgggaaaagcctaaagagcacgaga 382 

llll I I II ::::::: :: ::: :: : :::::: : :::: 

Db 1447 1388 

Qy 383 aacacgaagtcgaatatccgaaaatacccgagtacaaggacaaacaagatgagaataaga 442 

Db 1387 1328 

Qy 443 aacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaagagtacgaga 502 

Db 1327 1268 

Qy 503 aagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaacataaagccg 562 

Db 1267 1208 

Qy 563 aatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaacataaacatg 622 

Db 1207 1148 

Qy 623 agttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcatagtaccctgag 682 

Db 1147 1088 

Qy 683 tgggttaaaatgcctgaatgg 703 

Db 1087 RRRRRRRRRRRRRRRRRRRRR 1067 



AE001373 

LOCUS AE001373 12029 bp DNA INV 06-NOV-1998 

DEFINITION Plasmodium falciparum chromosome 2, section 10 of 73 of the 
complete sequence. 
ACCESSION AE001373 AE001362 
VERSION AE001373.1 GI:3845097 
SOURCE malaria parasite P. falciparum. 
ORGANISM Plasmodium falciparum 
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium. 
REFERENCE 1 (bases 1 to 12029) 
AUTHORS Gardner,M.J., Tettelin,H., Carucci,D.J., Cummings,L.M., Aravind,L., 
Koonin,E.V., Shallom,S., Mason,T., Yu,K., Fujii,C., Pederson,J., 
Shen,K., Jing,J., Aston,C., Lai,Z., Schwartz,D.C., Pertea,M., 
Salzberg,S., Zhou,L., Sutton,G.G., Clayton,R., White,O., 
Smith,H.O., Fraser,C.M., Adams,M.D., Venter,J.C. and Hoffman,S.L. 
TITLE Chromosome 2 sequence of the human malaria parasite Plasmodium 
falciparum 
JOURNAL Science 282 (5391), 1126-1132 (1998) 
MEDLINE 99021743 
REMARK Erratum: [[published erratum appears in Science 1998 Dec 
4;282(5395):1827]] 
REFERENCE 2 (bases 1 to 12029) 
AUTHORS Gardner,M.J. 
TITLE Direct Submission 
JOURNAL Submitted (02-NOV-1998) The Institute for Genomic Research, 9712 
Medical Center Drive, Rockville, MD 20814, USA 
FEATURES Location/Qualifiers 
source 1..12029 
/organism="Plasmodium falciparum" 
/db_xref="taxon:5833" 
/chromosome="2" 
gene 3709..5236 
/gene="PFB0110w" 
CDS join(3709..4318,5079..5236) 
/gene="PFB0110w" 
/note="predicted by GlimmerM" 
/codon_start=1 
/product="predicted integral membrane protein" 
/protein_id="AAC71812.1" 



Tue Sep 5 07:22:52 2000 us-08-984-099-1.rge Page 


Db 568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627 

Qy 434 -agaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaa 492 

Db 628 AAGAGTAAGGAACATAAAGATGAAGAGTGCCACGAGTCACACGAATCGAAAGATCACGAA 687 

Qy 493 gagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaa 552 

Db 688 GAGTACGAGAAAGAAAAACCCAATTTCTTCAAATGGGAAAAGCCTAAAGAGCACGAGAAA 747 

Qy 553 cataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaa 612 

Db 748 CATAAAGCCGAATATCCAAAAATACCCGAGTGCAAGGAAAAACAAGATGAGGATAAGGAA 807 

Qy 613 cataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcata 672 

IIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIMIIIIIIIIIIIII.IIIII I 

Db 808 GATAAACATGAGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCTGAGAAAGGCAGA 867 

Qy 673 gtaccctgagtgggttaaaatgcctgaatggccgaagtccatgtttactcagtctggctc 732 

Db 868 GTACCCTGAGTGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTACTCAGTCTGGCTC 927 

Qy 733 gagcactaagccttaagccatatgacactggtgcatgtgccatcatcatgcagtaatttc 792 

Db 928 GAGCATTAAGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTC 987 

Qy 793 atgggatattgtaattatattgttaataaaaaagatggtgagtgggaaatgtgtgtgtgc 852 

Db 988 ATGGGATATCGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGC 1047 

Qy 853 attcatccatg-agcaatgctgaatctctttgcatgcatagagattctgaatggttatag 911 

Db 1048 ATTCATCCATGTAGCAATGCTGAATCTCTTTGCATGCATAGAGATTCTGAATGGTTATAG 1107 

Qy 912 tttatgttatatcgtttgttctagtgaaattaattttgaatgttgtatgtaatgtt 967 

Db 1108 TTTATGTTATATCGTTTGTTCTAGTGAAATTAATTTTGAATGTTGTATCTAATGTT 1163 



I40338 

LOCUS I40338 1283 bp DNA PAT 13-MAY- 

DEFINITION Sequence 17 from patent US 5620882. 
ACCESSION I40338 
VERSION I40338.1 GI:2082630 
KEYWORDS . 
SOURCE Unknown. 

ORGANISM Unknown. 

Unclassified. 
REFERENCE 1 (bases 1 to 1283) 

AUTHORS John,M. 

TITLE Genetically engineering cotton plants for altered fiber 
JOURNAL Patent: US 5620882-A 17 15-APR-1997; 
FEATURES Location/Qualifiers 
source 1..1283 

/organism="unknown" 
BASE COUNT 509 a 233 c 251 g 290 t 
ORIGIN 



Query Match 75.5%; Score 730; DB 5; Length 1283; 

Best Local Similarity 82.5%; Pred. No. 4e-140; 

Matches 937; Conservative 0; Mismatches 30; Indels 169; Gaps 

Qy 1 ctttctatttggttaaccatggctcataactttcgtcatcctttcttccttttccaactt 60 

llllllllll II lllllilllllllllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 M M I 

Db 28 CTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTCTTCCTTTTCCAACTT 87 

Qy 61 ttactcattactgtctcactaatgatcggtagccacaccgtctcgtcagcggctcgacat 120 

Db 88 TTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGCTCGACAT 147 



Qy 121 ttattccacacacaaacaacctcatcagagctgccacaattggcttcaaaatacgaaaag 180 

Db 148 TTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATACGAAAAG 207 

Qy 181 cacgaagagtctgaatacaaacagccaaaatatcatgaagagtacccaaaacatgagaag 240 

Db 208 CACAAAGAGTCTGAATACAAACAACCAAAATATCACGAAAAGTACCCAAAACATGAGAAG 267 

Qy 241 cctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgag 300 

Db 268 CCTAAAATGCACAAGGAGGAAAAACAAAAACCCTGCAAACATCATGAAGAGTACCACGAG 327 

Qy 301 tcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgatttccccaaatgg 360 

Db 328 TCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCCGATTTCCCCAAATGG 387 

Qy 361 gaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaag 420 

Db 388 GAAAAGCCTAAAGAGCACAAGAAACACGAAGTTGAATATCCGAAAATACCCGAGTACAAG 447 

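Qy 421 gacaaacaagatg----------------------------------------------- 433 

Db 448 GACAAACAAGATGAGGATAAGGAACATAAAAATGAAGAGTACCATGAATCACGCGAATCG 507 

Qy 434 ------------------------------------------------------------ 433 

Db 508 AAGGAGCACGAAGAATACGAGAAAGAAAAACCCGAGTTCCCCAAATGGGAAAAGCCTAAA 567 

Qy 434 ------------------------------------------------------------ 433 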

Db 568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627 

Qy 434 -agaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaa 492 

Db 628 AAGAGTAAGGAACATAAAGATGAAGAGTGCCACGAGTCACACGAATCGAAAGATCACGAA 687 

Qy 493 gagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaa 552 

Db 688 GAGTACGAGAAAGAAAAACCCAATTTCTTCAAATGGGAAAAGCCTAAAGAGCACGAGAAA 747 

Qy 553 cataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaa 612 

Db 748 CATAAAGCCGAATATCCAAAAATACCCGAGTGCAAGGAAAAACAAGATGAGGATAAGGAA 807 

Qy 613 cataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcata 672 

Db 808 GATAAACATGAGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCTGAGAAAGGCAGA 867 

Qy 673 gtaccctgagtgggttaaaatgcctgaatggccgaagtccatgtttactcagtctggctc 732 

Db 868 GTACCCTGAGTGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTACTCAGTCTGGCTC 927 

Qy 733 gagcactaagccttaagccatatgacactggtgcatgtgccatcatcatgcagtaatttc 792 

Db 928 GAGCATTAAGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTC 987 

Qy 793 atgggatattgtaattatattgttaataaaaaagatggtgagtgggaaatgtgtgtgtgc 852 

Db 988 ATGGGATATCGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGC 1047 

Qy 853 attcatccatg-agcaatgctgaatctctttgcatgcatagagattctgaatggttatag 911 

Db 1048 ATTCATCCATGTAGCAATGCTGAATCTCTTTGCATGCATAGAGATTCTGAATGGTTATAG 1107 

Qy 912 tttatgttatatcgtttgttctagtgaaattaattttgaatgttgtatgtaatgtt 967 

Db 1108 TTTATGTTATATCGTTTGTTCTAGTGAAATTAATTTTGAATGTTGTATCTAATGTT 1163 



RESULT 5 
GBU34401 

LOCUS GBU34401 1699 bp DNA PLN 01-JAN-1996 

DEFINITION Gossypium barbadense FbLate-2 gene, complete cds. 
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25 


83.2 8.6 76897 42 AC016179 AC016179 Homo sapi 
26 83 8.6 169931 11 AC005822 AC005822 Homo sapi 
27 82 8.5 162575 41 AC004086 AC004086 Homo sapi 
28 81.6 8.4 83440 53 AC024285 AC024285 Homo sapi 
29 79.6 8.2 80432 51 AC022680 AC022680 Homo sapi 
30 79.6 8.2 222193 32 CNS01DSB AL121768 Homo sapi 
31 79.4 8.2 3489 82 KSU52064 U52064 Kaposi's sa 
32 79.4 8.2 32207 5 AR065852 AR065852 Sequence 
33 79.4 8.2 137508 82 KSU75698 U75698 Kaposi's sa 
34 79.4 8.2 141753 49 AC009323 AC009323 Arabidops 
35 79.4 8.2 174383 41 AC009781 AC009781 Homo sapi 
36 78.8 8.1 42839 69 AC027282 AC027282 Homo sapi 
37 78.6 8.1 164520 43 AC020738 AC020738 Homo sapi 
38 78.2 8.1 2000 2 AF019082 AF019082 Borrelia 
39 78.2 8.1 27323 2 AE000789 AE000789 Borrelia 
40 78.2 8.1 110000 31 PPMAL4P1J AL034557 Plasmodiu 
41 78.2 8.1 166547 41 HS1164I10 AL049537 Homo sapi 
42 77.8 8.0 43907 34 CELP36H12 AF078790 Caenorhab 
43 77.8 8.0 119500 51 AC015927 AC015927 Homo sapi 
44 77.6 8.0 89072 55 AC025070 AC025070 Homo sapi 
45 77.2 8.0 73020 51 AC022851 AC022851 Homo sapi 



I18362 
LOCUS I18362 1283 bp DNA PAT 07-OCT-1996 
DEFINITION Sequence 17 from patent US 5495070 
ACCESSION I18362 
VERSION I18362.1 GI:1598717 
SOURCE Unknown. 
ORGANISM Unknown. 

Unclassified. 
REFERENCE 1 (bases 1 to 1283) 
AUTHORS John,M. 
TITLE Genetically engineering cotton plants for altered fiber 
JOURNAL Patent: US 5495070-A 17 27-FEB-1996; 
FEATURES Location/Qualifiers 
source 1..1283 
/organism="unknown" 
BASE COUNT 509 a 233 c 251 g 290 t 
ORIGIN 



Query Match 75.5%; Score 730; DB 5; Length 1283; 

Best Local Similarity 82.5%; Pred. No. 4e-140; 

Matches 937; Conservative 0; Mismatches 30; Indels 169; Gaps 2; 

Qy 1 
iiiiiiiiii ii mmiiiiiiimiii iiiiiiiiiiiiiiiiiiiiiiini 
Db 28 

Qy 61 
1 1 i 1 1 1 1 1 1 1 [ 1 1 [ 1 1 1 hum iiiiiiiiiiiiiiiiiiiiiiiiiii 
Db 88 

Qy 121 
ilium iniiiiiiiiiiiinmiiiiimiiiiiiinmiiiiiiiiiM 
Db 148 

Qy 181 
in iiiiiiiiiiiiiiiiin iiiiiiiiiii in immiiimiiiiiii 
Db 208 

Qy 241 
in inn iiiiiimiiiimmmiimiiiiiimiiiiiiiiimii 
Db 268 

Qy 301 
iiiiiiiiiiiiiiiiimiiiijimiimmimimiiiiimiiiiiii 
Db 328 TCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCCGATTTCCCCAAATGG 387 

Qy 361 
mimiiiiiiimi mimiiiiii miiiimiimiiiiiiiimi 
Db 388 GAAAAGCCTAAAGAGCACAAGAAACACGAAGTTGAATATCCGAAAATACCCGAGTACAAG 447 

Qy 421 
Db 448 

Qy 434 
Db 508 

Qy 434 
Db 568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627 

Qy 434 
III III! Illlllllllllllllllllll llllllllllllllllllll llllll 
Db 628 

Qy 493 
II 1 1! Ill IN I Ml II II I HIM llllllllllllllllllll llllll I 111 
Db 688 GAGTACGAGAAAGAAAAACCCAATTTCTTCAAATGGGAAAAGCCTAAAGAGCACGAGAAA 747 

Qy 553 
f 1 1 1 1 1 1 1 1 1 1 1 ( 1 1 1 1 nun iiiiimiiiimi 1 1 1 1 ) 1 1 1 m 1 1 1 i I ! 
Db 748 

Qy 613 
miimiiimimiiiiiiiimimiiiiiimiiiiiiiiimiii i 
Db 808 GATAAACATGAGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCTGAGAAAGGCAGA 867 

Qy 673 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 III 1 1 1 1 1 1 1 1 III 1 1 1 1 III 1 1 1 1 III 1 1 1 1 III 1 1 1 1 1 1 1 1 1 II 
Db 868 GTACCCTGAGTGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTACTCAGTCTGGCTC 927 

Qy 733 
lllll IIIIIIIIII llllll i 1 1 1 1 1 1 r 
Db 928 

Qy 793 
imiim iimiiiiiiiiiiiimmiiiiiiiiiiiiiiiiiiimimi 
Db 988 ATGGGATATCGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGC 1047 

Qy 853 
imiimn miimiimmiimiiiimimiimmiiiiim 
Db 1048 

Qy 912 
1 1 1 : 1 1 1 1 1 1 1 1 1 : 1 1 ! M 1 1 1 : 1 1 1 : 1 1 1 1 M ! M 1 1 1 1 : 1 1 1 ! 1 1 lllllll 
Db 1108 



RESULT 2 
I21349 

LOCUS I21349 1283 bp DNA PAT 07-OCT- 

DEFINITION Sequence 17 from patent US 5521078. 

ACCESSION I21349 

VERSION I21349.1 GI:1601703 

KEYWORDS 

SOURCE Unknown. 

ORGANISM Unknown. 

Unclassified. 
REFERENCE 1 (bases 1 to 1283) 

AUTHORS John,M. 

TITLE Genetically engineering cotton plants for altered fiber 
JOURNAL Patent: US 5521078-A 17 28-MAY-1996; 
FEATURES Location/Qualifiers 
source 1..1283 

/organism="unknown" 
BASE COUNT 509 a 233 c 251 g 290 t 



Tue Sep 5 07:22:53 2000 us-08-984-099-1.rni Page 10 



ATTORNEY/AGENT INFORMATION: 
NAME: Osman Ph.D., Richard A 
REGISTRATION NUMBER: 36,627 
REFERENCE/DOCKET NUMBER: UCB96-055 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: (415)343-041 
TELEFAX: (415)343-4342 
INFORMATION FOR SEQ ID NO: 2: 

SEQUENCE CHARACTERISTICS: 
LENGTH: 2277 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 

MOLECULE TYPE: cDNA 
US-08-676-967-2 



Query Match 5.6%; Score 54.2; DB 2; Length 2277; 

Best Local Similarity 31.3%; Pred. No. 1.2e-05; 

Matches 117; Conservative 66; Mismatches 188; Indels 3; Gaps 

Qy 317 agcacgaagagtacgataaagaaaaacccgatttccccaaatgggaaaagcctaaagagc 376 

h I: I: I I 11:1 |:: r :||:||: : I ||: 

Db 575 ARGAYAARTAYAARGAYACNCARWSNGTNWSNGCNATHGGNGARGARAMWSNCAYGARW 634 

Qy 377 acgagaaacacgaagtcgaatatccgaaaatacccgagtacaaggacaaacaagatgaga 436 

I: I II II: I I Ml: : II: I MM Ml II: 
Db 635 SNAARCAYCARGARWSNGTNAARAARAARGGNMGNGARGARGARGAYATGGARGARGARG 694 

Qy 437 ataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaagagt 496 

I II I I: I 11:11 II : I lh : hi II MM 

Db 695 ARAAYGAYGAYGAYGAYGAYGAYGAYGAYGARGARGAYGGNGTNTTYGAYGAYGARGAYG 754 

Qy 497 acgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaacata 556 

I II: MM :|l : : ||: MM : : MM:: 
Db 755 ARGARGARGARAAYATHGARW---SNAARGTNACNAARCCNGTNCARATHCARAARMGNG 811 

Qy 557 aagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaacata 616 

I I: II I I::: : II I II IMM:: || : 

Db 812 CNGTNAARMGNCCNGCNCCNGCNAARWSNWSNGAYCAYWSNGARGARGAYWSNGAYYTNG 871 

Qy 617 aacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcatagtac 676 

1:1 : : I Ml IMM M MM MM 
Db 872 ARGARWSNGAYWSNATHGAYGAYGGNGARGARYTNGCNCARWSNGAYACNWSNACNGARG 931 

Qy 677 cctgagtgggttaa 690 

:| :| : |: 
Db 932 ARCARGARGAYAAR 945 



RESULT 14 
US-08-676-974-2 
Sequence 2, Application US/08676974 
Patent No. 5770422 
GENERAL INFORMATION: 
APPLICANT: COLLINS, KATHLEEN 
TITLE OF INVENTION: Human Telomerase 
NUMBER OF SEQUENCES: 10 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: Science & Technology Law Group 
STREET: 268 Bush Street, Suite 3200 
CITY: San Francisco 
STATE: CA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 

CURRENT APPLICATION DATA: 
APPLICATION NUMBER: US/08/676,974 
FILING DATE: 



CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 

NAME: Osman Ph.D., Richard A 

REGISTRATION NUMBER: 36,627 

REFERENCE/DOCKET NUMBER: UCB96-055 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (415)343-4341 

TELEFAX: (415)343-4342 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2277 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
US-08-676-974-2 



Query Match 5.6%; Score 54.2; DB 2; Length 2277; 
Best Local Similarity 31.3%; Pred. No. 1.2e-05; 
Matches 117; Conservative 66; Mismatches 188; Indels 3; Gaps 1; 

Qy 317 
Db 575 

Qy 377 
Db 635 

Qy 437 
Db 695 

Qy 497 
Db 755 

Qy 557 
Db 812 

Qy 617 
Db 872 

Qy 677 
Db 932 



RESULT 15 
US-09-098-487-2 
Sequence 2, Application US/09098487 
Patent No. 5917025 
GENERAL INFORMATION: 
APPLICANT: COLLINS, Kathleen 
TITLE OF INVENTION: Human Telomerase 
NUMBER OF SEQUENCES: 11 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: Science & Technology Law Group 
STREET: 268 Bush Street, Suite 3200 
CITY: San Francisco 
STATE: CA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1 
CURRENT APPLICATION DATA: 
APPLICATION NUMBER: US/09/098,487 



Tue Sep 5 07:22:53 2000 us-08-984-099-1.rni Page 8 



I I Ml I! I III II 1 1 1 1 I II I II II I II 
Db 19535 AGGAGCAGGAGTTAGAGGAGCAGGAGCAGGAGTTAGAGGAGCAGGAGCAGGAGTTAGAGG 19476 

Qy 398 atccgaaaatacccgagtacaaggacaaacaagatgagaataagaaacataaagatgaag 457 

I I I I I III I II I I II III I II I I I II II 

Db 19475 AGCAGGAGGTGGAAGAGCAAGAGCAGGAGGTGGAAGAGCAAGAGCAGGAGCAGGAAGAGC 19416 

Qy 458 agtgccaggagtcacacgaatcgaaagagcacgaagagtacgagaaagaaaaacccgatt 517 

II III II I MINI II II I III I II II || I 

Db 19415 AGGAATTAGAGGAGGTGGAGGAGCAAGAGCAGGAGCAGGAGGAGCAGGAGGAGCAGGAGT 19356 

Qy 518 tccccaaatgggaaaagcctaaagggcacgagaaacataaagccgaatatccgaaaatac 577 

I I Mil III III III III I I I I III I I Ml I 
Db 19355 TAGAGGAGGTGGAAGAGCAGGAAGAGCAGGAGTTAGAGGAGGTGGAAGAGCAGGAAGAGC 19296 

Qy 578 ctgagtgcaaggaaaaactagatgaggataaggaacataaacatgagttcccaaagcatg 637 

mi mi iii im n i i mm i iiii i 

Db 19295 AGGAGTTAGAGGAGGTGGAAGAGCAGGAGCAGCAGGAGTTAGAGGAGGTGGAAGAGCAGG 19236 

Qy 638 aaaaagaagaggagaagaaacctgagaaaggcatagt 674 

I I I I'll II II III I I I II 
Db 19235 AGCAGCAGGGGGTGGAACAGCAGGAGCAGGAGACGGT 19199 



RESULT 9 
US-08-931-999-4 

Sequence 4, Application US/08931999 
Patent No. 6043219 
GENERAL INFORMATION: 

APPLICANT: Iandolo, John J. 

APPLICANT: Crupper, Scott S. 

TITLE OF INVENTION: Broad Spectrum Chemotherapeutic Peptide 

NUMBER OF SEQUENCES: 4 



ADDRESSEE: Hovey, Williams, Timmons & Collins

STREET: 2405 Grand Boulevard, Suite 400 

CITY: Kansas City 

STATE: Missouri 

COUNTRY: U.S.A. 

ZIP: 64108 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/931,999 

FILING DATE: 

CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/710,561 

FILING DATE: 19-SEP-1996 
ATTORNEY/AGENT INFORMATION: 

NAME: Collins, John M. 

REGISTRATION NUMBER: 26,262 

REFERENCE/DOCKET NUMBER: 25043-A
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 816/474-9050 

TELEFAX: 816/474-9057 
INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 6755 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: unknown 
MOLECULE TYPE: DNA (genomic) 
HYPOTHETICAL: NO 
ANTI-SENSE: NO
ORIGINAL SOURCE: 

ORGANISM: Staphylococcus aureus 

STRAIN: UT0007 
US-08-931-999-4



Query Match 6.5%; Score 62.6; DB 5; Length 6755; 

Best Local Similarity 50.1%; Pred. No. 1.1e-07;

Matches 211; Conservative 0; Mismatches 204; Indels 6; Gaps 

Qy 246 aatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgagtcacg 305 

II I I II II I llllllll I III I III III I I II I 

Db 6203 AACGAAAAACGACAAGAAACAAAACACAAAGAGAGAAAAAGAAAAGAAAAAAAACGCAGG 6262 

Qy 306 cgaatcgaaggagcacgaagagtacgataaagaaaaacccgatttccccaaatgggaaaa 365 

III I II III III II IIII II I I III 

Db 6263 AGAAACAAA AGCAGGGAAACAAGCAAAAAAACCCGCAAAGACGAGCCCACAAAAG 6317 

Qy 366 gcctaaagagcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaaggacaa 425 

I II I III III I- II II I II IIII I I IIII I I 

Db 6318 AGGAAGAAACCGCGAAAAAAAGAAAAAAAAAAAGCCAAAAACAAAAAGGAACAACAAGCA 6377 

Qy 426 acaagatgagaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaaga 485 

I IIII II Mil II III I II II I III I III I 
Db 6378 AAAAGAAGAAGAAAAGGGA-AGAAAAAAGAGGAAAACAGAGAGAACAAAAGCCAAAAAAA 6436 

Qy 486 gcacgaagagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggca 545 

mm ii limn ii i i mm iii mi iiii 

Db 6437 CAACGAAGCGGAGAAGAAAGAGCAAAAACAAACACGCAAAAGGGCAAAAAAGCACGACCA 6496 

Qy 546 cgagaaacataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgagga 605 

III IIII Mill II I I Mill III IM 
Db 6497 AACAAAAGACAAACAAGCACAACAAAGCGAACAACACAACCCAAAAAAAAGAGAACAGAA 6556 

Qy 606 taaggaacataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaa 665 

ii mi iii iiiii limn i i ii mm in 

Db 6557 AAACAAACACGAGAACAACAAAGAAAGGGAGGCAAAAGAACAAAACAAAAAACACGAAGA 6616 

Qy 666 a 666 

I 

Db 6617 A 6617



RESULT 10 
US-08-257-073-4 
Sequence 4, Application US/08257073 
Patent No. 5766597 
GENERAL INFORMATION: 
APPLICANT: Paoletti, Enzo 
APPLICANT: de Taisne, Charles 
APPLICANT: Tine, John A. 

TITLE OF INVENTION: MALARIA RECOMBINANT POXVIRUS VACCINE 
NUMBER OF SEQUENCES: 143 



ADDRESSEE: Curtis, Morris & Safford, P.C.
STREET: 530 Fifth Avenue, 25th Floor 
CITY: New York 
STATE: New York 

COUNTRY: UNITED STATES OF AMERICA 

ZIP: 10036 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/257,073 

FILING DATE: 09-JUN-1994 

CLASSIFICATION: 424
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/075,783 

FILING DATE: 11-JUN-1993
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/852,305 

FILING DATE: 18-MAR-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/672,183 

FILING DATE: 20-MAR-1991 



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Db 268
Qy 301
Db
Qy 361
Db 388
Qy 421
Db 448
Qy 434
Db 508
Qy 434
Db 568
Qy 434
Db 628
Qy 493
Db 688
Qy 553
Db 748
Qy 613
Db 808
Qy 673
Db 868
Qy 733
Db 928
Qy 793
Db 988
Qy 853
Db 1048
Qy 912
Db 1108



268 CCTAAAATGCACAAGGAGGAAAAACAAAAACCCTGCAAACATCATGAAGAGTACCACGAG 327 



IIIMMIIIIIIIIIIMIIIMIMIIIIIIIMMIIIINIIIMIMIMIMM 



imiiiiiiiiiinii iiiiiimmi iniiiiiiiiiiiiiiiiiiiiiiii 



568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627



III III! llllllllllllllllllllll llllllllllllllllllll MINI 



lllllllllllllllllllll Mill llllllllllllllllllll MINIUM 



lllllllllllllllll llllllll llllllllllllllll 1 1 1 1 1 M 1 1 1 1 1 ! 1 1 



lllllllllllllllllllllllllllllllllllllllllllllllllllllllll I 



llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 



Mill llllllllllllllllllllllllllllllllllllllllllllllllllllll 



III! Illlllllllllllllllllllllllllllllllllllllllllllllll 



lllllllllll llllllllllllllllllllllllllllllllllllllllllllllll 



1 1 1 [ 1 1 1 1 M S [ 1 1 1 1 1 1 1 E 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 M ! I MUM! 



RESULT 6 
US-08-232-463-14/C 

Sequence 14, Application US/08232463

Patent No. 5670367 

GENERAL INFORMATION: 



APPLICANT:
APPLICANT: SCHEIFLINGER, F.
APPLICANT: FALKNER, F. G.
TITLE OF INVENTION: RECOMBINANT FOWLPOX VIRUS 
NUMBER OF SEQUENCES: 52 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: Foley & Lardner
STREET: 1800 Diagonal Road, Suite 500 
CITY: Alexandria 



STATE: VA 

COUNTRY: USA 

ZIP: 22313-0299 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/232,463 

FILING DATE: 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/07/935,313 

FILING DATE: 

APPLICATION NUMBER: EP 91 114 300.6 

FILING DATE: 26*1991 
ATTORNEY/AGENT INFORMATION: 

NAME: BENT, Stephen A. 

REGISTRATION NUMBER: 29,768 

REFERENCE/DOCKET NUMBER: 30472/114 IMMU 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (703)836-9300 

TELEFAX: (703)683-4109 

TELEX: 899149 
INFORMATION FOR SEQ ID NO: 14: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 7218 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single

TOPOLOGY: linear 

IMMEDIATE SOURCE: 
CLONE: pTZgpt-FlS 
US-08-232-463-14 



Query Match 10.2%; Score 99; DB 1; Length 7218; 

Best Local Similarity 8.2%; Pred. No. 1.7e-17;

Matches 36; Conservative 255; Mismatches 150; Indels 0; Gaps



Qy 263
Db 1507
Qy 323
Db 1447
Qy 383
Db 1387
Qy 443
Db 1327
Qy 503
Db 1267
Qy 563
Db 1207
Qy 623
Db 1147
Qy 683
Db 1087



ii iii i ii i him 



i mi 



mi ii in i 



1388 

383 aacacgaagtcgaatatccgaaaatacccgagtacaaggacaaacaagatgagaataaga 442 

1328 



443 aacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaagagtacgaga 502

1268 



503 aagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaacataaagccg 562 



1208 



563 aatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaacataaacatg 622 

1148 
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lllllllllllllllllllllll llllllll IIMIMIIMIMIIIIIIIIIIIII 

Db 88 TTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGCTCGACAT 147 

Qy 121 ttattccacacacaaacaacctcatcagagctgccacaattggcttcaaaatacgaaaag 180

llllllll lllllllllllllllllllllllllllllllllllllllllllllllllll 

Db 148 TTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATACGAAAAG 207 

Qy 181 cacgaagagtctgaatacaaacagccaaaatatcatgaagagtacccaaaacatgagaag 240 

III lllllllllllllllllll lllllllllll III I M I M ! 1 1 1 1 1 1 1 1 1 M 1 1 
Db 208 CACAAAGAGTCTGAATACAAACAACCAAAATATCACGAAAAGTACCCAAAACATGAGAAG 267

Qy 241 cctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgag 300 

III HIM llllllllllllllllllllllllllllllllllllllllllllllllll 
Db 268 CCTAAAATGCACAAGGAGGAAAAACAAAAACCCTGCAAACATCATGAAGAGTACCACGAG 327

Qy 301 tcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgatttccccaaatgg 360 

llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 
Db 328 TCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCCGATTTCCCCAAATGG 387

Qy 361 gaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaag 420 

1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1! I lllllllllllll I [ 1 ) J 1 1 ! 1 1 1 1 1 ! 1 1 1 1 1 f 1 1 ! 11 II 
Db 388 GAAAAGCCTAAAGAGCACAAGAAACACGAAGTTGAATATCCGAAAATACCCGAGTACAAG 447 

Qy 421 gacaaacaagatg 433 

lllllllllllll 

Db 448 GACAAACAAGATGAGGATAAGGAACATAAAAATGAAGAGTACCATGAATCACGCGAATCG 507 

Qy 434 433 

Db 508 AAGGAGCACGAAGAATACGAGAAAGAAAAACCCGAGTTCCCCAAATGGGAAAAGCCTAAA 567 

Qy 434 433 

Db 568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627

Qy 434 -agaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaa 492 

III Mil I M I ! I M 1 1 1 1 1 1 1 1 1 1 M 1 1 IIMIIIIIIIIIIIIIIII HUM 
Db 628 AAGAGTAAGGAACATAAAGATGAAGAGTGCCACGAGTCACACGAATCGAAAGATCACGAA 687 

Qy 493 gagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaa 552 

lllllllllllllllllllll Mill IIMIIIIIIIIIIIIIIII MINIMI 
Db 688 GAGTACGAGAAAGAAAAACCCAATTTCTTCAAATGGGAAAAGCCTAAAGAGCACGAGAAA 747 

Qy 553 cataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaa 612 

lllllllllllllllll llllllll llllllllllllllll llllllllllllllll 

Db 748 CATAAAGCCGAATATCCAAAAATACCCGAGTGCAAGGAAAAACAAGATGAGGATAAGGAA 807 

Qy 613 cataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcata 672 

lllllllllllllllllllllllllllllllllllllllllllllllllllllllll I 
Db 808 GATAAACATGAGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCTGAGAAAGGCAGA 867 

Qy 673 gtaccctgagtgggttaaaatgcctgaatggccgaagtccatgtttactcagtctggctc 732 

IMIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIII 
Db 868 GTACCCTGAGTGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTACTCAGTCTGGCTC 927 

Qy 733 gagcactaagccttaagccatatgacactggtgcatgtgccatcatcatgcagtaatttc 792 

inn iiiiiiiiiiiiimiiiiiiiiiiiiiiiiiiiiiiiiimiiiiiimi 

Db 928 GAGCATTAAGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTC 987 
Qy 793 atgggatattgtaattatattgttaataaaaaagatggtgagtgggaaatgtgtgtgtgc 852 

iiimm iiiiiiiiiiimmiiiiiiiiiiiimiiiiimiiiimiii 

Db 988 ATGGGATATCGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGC 1047 

Qy 853 attcatccatg-agcaatgctgaatctctttgcatgcatagagattctgaatggttatag 911 

lllllllllll milllllllimillllllllllllllllllllllllllllllll 

Db 1048 ATTCATCCATGTAGCAATGCTGAATCTCTTTGCATGCATAGAGATTCTGAATGGTTATAG 1107 

Qy 912 tttatgttatatcgtttgttctagtgaaattoattttgaatgttgtatgtaatgtt 967 

llllllllllllllllllllllllllllllllllllllllllllllll lllllll . 
Db 1108 TTTATGTTATATCGTTTGTTCTAGTGAAATTAATTTTGAATGTTGTATCTAATGTT 1163 



US-08-298-829-17

Sequence 17, Application US/08298829 
Patent No. 5620882 
GENERAL INFORMATION: 
APPLICANT: John, Maliyakal E. 

TITLE OF INVENTION: GENETICALLY ENGINEERING COTTON 
TITLE OF INVENTION: PLANTS FOR ALTERED FIBER 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Nicholas J. Seay, Quarles & Brady

STREET: P.O. Box 2113, First Wisconsin Plaza 

CITY: Madison 

STATE: Wisconsin

COUNTRY: USA 

ZIP: 53701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: Microsoft Word 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/298,829 

FILING DATE: 19-OCT-1994

CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/885,970 

FILING DATE: 18-MAY-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/617,239 

FILING DATE: 21-NOV-1990 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/253,243 

FILING DATE: 04-OCT-1988
ATTORNEY/AGENT INFORMATION: 

NAME: Seay, Nicholas J. 

REGISTRATION NUMBER: 27,386 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (608) 283-2478 

TELEFAX: (608) 251-5139 
INFORMATION FOR SEQ ID NO: 17: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1283 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA
HYPOTHETICAL: NO 
ANTI-SENSE: NO 
ORIGINAL SOURCE: 

ORGANISM: Gossypium hirsutum 

STRAIN: Coker 312 

DEVELOPMENTAL STAGE: 15 day old fiber cells 
TISSUE TYPE: fiber cells 
IMMEDIATE SOURCE: 
LIBRARY: CKFB15 
CLONE: E9 
US-08-298-829-17



Query Match 75.5%; Score 730; DB 1; Length 1283; 

Best Local Similarity 82.5%; Pred. No. 8.9e-188;

Matches 937; Conservative 0; Mismatches 30; Indels 169; Gaps 2; 

Qy 1 ctttctatttggttaaccatggctcataactttcgtcatcctttcttccttttccaactt 60 

llllllllll II lllllllllllllllll immillllllllllllllllll 
Db 28 CTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTCTTCCTTTTCCAACTT 87 

Qy 61 ttactcattactgtctcactaatgatcggtagccacaccgtctcgtcagcggctcgacat 120 

iiiiiiiiiiiiiiiiiiiii ilium iimmiiiiiiiiimiiiiiii 

Db 88 TTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGCTCGACAT 147 

Qy 121 ttattccacacacaaacaacctcatcagagctgccacaattggcttcaaaatacgaaaag 180 

llllllll llllllllllllllllimilllllllimillllllllllllllllll 
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TISSUE TYPE: fiber cells 
IMMEDIATE SOURCE: 
LIBRARY: CKFB15 
CLONE: E9 
US-07-885-970A-17 



Query Match 75.5%; Score 730; DB 1; Length 1283; 

Best Local Similarity 82.5%; Pred. No. 8.9e-188; 

Matches 937; Conservative 0; Mismatches 30; Indels 169; Gaps 

Qy 1 ctttctatttggttaaccatggctcataactttcgtcatcctttcttccttttccaactt 60 

llllllllll II lllllllllllllllllll llllllllllllllllllllllllll 
Db 28 CTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTCTTCCTTTTCCAACTT 87 

Qy 61 ttactcattactgtctcactaatgatcggtagccacaccgtctcgtcagcggctcgacat 120

lllllllllllllllllllllll MINIM lllllllllllllllllllllllllll 
Db 88 TTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGCTCGACAT 147

Qy 121 ttattccacacacaaacaacctcatcagagctgccacaattggcttcaaaatacgaaaag 180 

MINI 1 1 1 [ f 1 1 1 1 M 1 1 M 1 1 1 1 M f I M 1 1 M II 1 1 1 1 ! 1 1 f 1 1 1 1 1 M 1 1 1 1 1 
Db 148 TTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATACGAAAAG 207 

Qy 181 cacgaagagtctgaatacaaacagccaaaatatcatgaagagtacccaaaacatgagaag 240 

III lllllllllllllllllll 1 1 II 1 1 1 [ 1 1 1 III 1 1 1 1 ! 1 1 !! 1 1 1 1 M 1 1 1 1 1 
Db 208 CACAAAGAGTCTGAATACAAACAACCAAAATATCACGAAAAGTACCCAAAACATGAGAAG 267 

Qy 241 cctgaaatgtacaaggaggaaaaacaaaaaccctgcaaacatcatgaagagtaccacgag 300 

III Hill IMMIMMMIMMIMMMMIMIIMIMMMIMMMI 
Db 268 CCTAAAATGCACAAGGAGGAAAAACAAAAACCCTGCAAACATCATGAAGAGTACCACGAG 327 

Qy 301 tcacgcgaatcgaaggagcacgaagagtacgataaagaaaaacccgatttccccaaatgg 360 

MIIIMIIIMMMIMIIMIMMIMMIIMIIIMMIMMMIMMMI 
Db 328 TCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCCGATTTCCCCAAATGG 387 

Qy 361 gaaaagcctaaagagcacgagaaacacgaagtcgaatatccgaaaatacccgagtacaag 420 

IINIIIIIIIIIIIIII lllllllllllll lllllllllllllllllllllllllll 
Db 388 GAAAAGCCTAAAGAGCACAAGAAACACGAAGTTGAATATCCGAAAATACCCGAGTACAAG 447 

Qy 421 gacaaacaagatg 433 

lllllllllllll 

Db 448 GACAAACAAGATGAGGATAAGGAACATAAAAATGAAGAGTACCATGAATCACGCGAATCG 507 

Qy 434 - 433 

Db 508 AAGGAGCACGAAGAATACGAGAAAGAAAAACCCGAGTTCCCCAAATGGGAAAAGCCTAAA 567 

Qy 434 433 

Db 568 GAGCACGAGAAACACGAAGTCGAATATCCGAAAATACCCGAGTACAAGGAAAAGCAAGAT 627 

Qy 434 -agaataagaaacataaagatgaagagtgccaggagtcacacgaatcgaaagagcacgaa 492 

III Mil IIIIIIIIIIIMIIIIIIIII IMIMMMIMIIIIII llllll 
Db 628 AAGAGTAAGGAACATAAAGATGAAGAGTGCCACGAGTCACACGAATCGAAAGATCACGAA 687 

Qy 493 gagtacgagaaagaaaaacccgatttccccaaatgggaaaagcctaaagggcacgagaaa 552 

1 1 1 1 1 M I! 1 1 1 1 1 1 II 1 1 1 1 Mill lllllllllllllllllll llllllllll 
Db 688 GAGTACGAGAAAGAAAAACCCAATTTCTTCAAATGGGAAAAGCCTAAAGAGCACGAGAAA 747 

Qy 553 cataaagccgaatatccgaaaatacctgagtgcaaggaaaaactagatgaggataaggaa 612 

iiiiiiiiiiiiiini muni i M m 1 1 1 1 i 1 1 1 r 1 1 inimimimi 

Db 748 CATAAAGCCGAATATCCAAAAATACCCGAGTGCAAGGAAAAACAAGATGAGGATAAGGAA 807 

Qy 613 cataaacatgagttcccaaagcatgaaaaagaagaggagaagaaacctgagaaaggcata 672 

I M 1 1 M I M 1 1 M 1 1 1 1 1 1 M 1 1 1 1 [ 1 1 1 1 i 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 i 1 1 f 1 1 I 
Db 808 GATAAACATGAGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCTGAGAAAGGCAGA 867 

Qy 673 gtaccctgagtgggttaaaatgcctgaatggccgaagtccatgtttactcagtctggctc 732

1 1 f 1 1 1 1 1 1 M 1 1 1 1 M t M 1 1 1 M I M I llllllllll M 1 1 M M 1 1 II n 1 1 n II 
Db 868 GTACCCTGAGTGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTACTCAGTCTGGCTC 927 

Qy 733 gagcactaagccttaagccatatgacactggtgcatgtgccatcatcatgcagtaatttc 792 

Mill llllllllimilllllllllllllllllllllllMIIIIIIIIIIIIIIII 



Db 928
Qy 793
Db 988
Qy 853
Db 1048
Qy 912
Db 1108



928 GAGCATTAAGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTC 987 



iiiiiiiii iiiiiiiiiMMmiiimiMiMiiiiiiiiiiiiiiiiiiiiii 

ATGGGATATCGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGC 1047 



mimmi miMiiiiiiiiiimiimiiiiMiiiiiiiMiiiiiiMi 



I M ! 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 M M 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 MUM 



RESULT 2 
US-08-298-687A-17 
Sequence 17, Application US/08298687A 
Patent No. 5521078 
GENERAL INFORMATION: 
APPLICANT: John, Maliyakal E.
TITLE OF INVENTION: GENETICALLY ENGINEERING COTTON 
TITLE OF INVENTION: PLANTS FOR ALTERED FIBER 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Nicholas J. Seay, Quarles & Brady
STREET: P.O. Box 2113, First Wisconsin Plaza 
CITY: Madison 
STATE: Wisconsin 
COUNTRY: USA 
ZIP: 53701 



MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: Microsoft Word

CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/298,687A

FILING DATE: 

CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/617,239 

FILING DATE: 21-NOV-1990 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/253,243 

FILING DATE: 04-OCT-1988
ATTORNEY/AGENT INFORMATION: 

NAME: Seay, Nicholas J. 

REGISTRATION NUMBER: 27,386 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (608) 283-2478 

TELEFAX: (608) 251-5139 
INFORMATION FOR SEQ ID NO: 17: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1283 base pairs

TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
HYPOTHETICAL: NO 
ANTI-SENSE: NO 
ORIGINAL SOURCE: 

ORGANISM: Gossypium hirsutum 

STRAIN: Coker 312 

DEVELOPMENTAL STAGE: 15 day old fiber cells 
TISSUE TYPE: fiber cells 

IMMEDIATE SOURCE: 
LIBRARY: CKFB15
CLONE: E9 
US-08-298-687A-17 



Query Match 75.5%; Score 730; DB 1; Length 1283;
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BASE COUNT 
ORIGIN 



/clone="BACR29P01"
/note="end : T7"
289 a 155 c 118 g 307 t 232 others 



Query Match 3.4%; Score 105; DB 122; Length 1101;

Best Local Similarity 36.9%; Pred. No. 1.3e-06;

Matches 229; Conservative 104; Mismatches 285; Indels 3; Gaps 

Qy 324 tttgagtttgtatgatgataagtcgacataancgaaatatggtgtgatcttcacttttga 383 

h: I: : II h lh II I : I : h II III: III 
Db 1082 TDWTADAAGRAATTTTDTATAGKTGAGAATWRTGTKTWTTKATTTTTTTTTTDWRTGTTA 1023 

Qy 384 actttgataagtcaccaaactttaacaaagtttgattgtgtacatatatatatatatctt 443 

II I hi : :h :|:|: I lh II I :::|:|::: I h 

Db 1022 TRRRTGTTTAKTKWATRARWWTWADGAIGAGTTDGTTTTTRTTTRTKDKAWAWWWTTTTK 963 

Qy 444 caaattttataataaaaattgtgtttaaataatttacagttatattatttttttatctct 503 

Ml: :h II : :ll:: :llh lllll I I : lh: :|:: : I 
Db 962 WTGWTTWGWTWTTAKDTRTTDWTWTTADKTAATTGAGTGAAWAKTTRWAWTWWATAAKAT 903 

Qy 504 aattttatttgtcgccaaatttttagttgatattttaacataaaaaaaattgtacacatt 563 

:| :hhl I :hllh: : I II : I I II :| I Ml 
Db 902 TRTAGARTKTKTGTRRATRTDTTTDKAKAGTTTTGATRGAGAKAGATTTWTATGTAAATA 843 

Qy 564 tacaagcccatatacaaataattatataaatattcattaaaaaatatatttaaatatagg 623

II ::HI II III II: :| II II :hll hll hi I ::: 
Db 842 ATDTAG--WWATAAAATTAAAAARAAWTWTTTTWATATWAWAAATTRTTAAWAAARWDR 786 

Qy 624 atataaatataactattttagaattattctactttaagataacataggttaaatgtataa 683 

I hi I I II I II : I I I III I :l :l -I I h 

Db 785 TAAAAWAATATAATTRTTAAAAAAWTATTTTTTATAAATAATTTWAWTWTTWWTTTTTWG 726 

Qy 684 ttaataaggttagtttattgtaaagatgagtatatatgtcgtaaacataatcactaacca 743 

I : :| :|:|: I llh : I II I : I I I 
Db 725 ATAAWWAWAAAAAAKAWTAKTRARATAAATTATWATWATTATATATWATTTTTATTTTTA 666 

Qy 744 tttttattaacttcttggttttgaagttccaaaaagaaaatggaagggaaatttgagagt 803 

h :ll ::l II I II : I I II lllll I III II : : 
Db 665 TWAWTAADWATTTAATAAATATAWTAATTTATTAATAAAATAWATWAAAAAATTTTARAD 606 

Qy 804 aagttcatgtttatattatacataatgaagttgatgttttcttctttttaatatttttat 863 

I I: :| :l hi III Mil :| I llhl :lhll 
Db 605 ATTAAATTAWAAWTTAWAATAAAWAATTAWTGAAAATTTTWWTTATATTAWTTWTTKTAA 546

Qy 864 acaaaatatttaaataaaataattaaggattgaatgaaaaatataatgaaagtcgtttta 923 

llllhll :| III II :lh II : I ::| I II III h ' 
Db 545 KTAAAATRTTAWATTAATTTATWTAWATTTTWWTTTTTTWWTTTTATTAAAATNAAACAA 486 

Qy 924 ctaatagtcatattgcatttt 944 

II II I III 

Db 485 NAAAANNGCTTCCCCCCNTTT 465 



Search completed: September 2, 2000, 22:59:09 
Job time: 19365 sec
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filters for hybridization from the BACPAC Resource Center can be 
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES Location/Qualifiers 
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BACR29P01"
/note="end : TET3"
BASE COUNT 366 a 66 c 104 g 351 t 214 others
ORIGIN



Query Match 3.5%; Score 107.4; DB 122; Length 1101; 

Best Local Similarity 38.6%; Pred. No. 6e-07;

Matches 230; Conservative 90; Mismatches 273; Indels 3; Gaps 1; 
Qy 2081 attataaattccattcttctattttactaagatattagtaacttcaaactgctgattttt 2140 

::: I II |:: : :| |:|: l|:|| I II : I : I I 
Db 1041 MWWTTTTTTTTTTTWMAWAYACAMMAYTWTTTTAWTATTTTTTTTTTMAATATYCWATAT 982 

Qy 2141 actaatttattatttataaattgttagaatgattatttttcaataatttaacaacaatat 2200 

I: :: : I :: I III :|l III I I MUM II II I 
Db 981 TTTMAATACAWWAYATTTWWTATACATAATWWTTTTTTATATACAATTTWAAAATAAAAA 922 

Qy 2201 ttaatattattattattattatttctcaatttttattaaacaaaaacataaatttttgac 2260 

;IM :: III : : |;| I I ::| III :| I |: ;| |::| I 
Db 921 WTMCWWAATTTAWAMWCATTWTTTTTMWWTMTTTWATTACAWTWTTAWWTAAAAAA 862 

Qy 2261 aaattaaaataaatgaattaatttctcaatttttcgtgcaactattacaaaaatccttca 2320 

I :IHII:|II|: II :: II : I II : : I :: I 1 : 1 1 II:: 

Db 861 ATWTTAAAWTAAAWAAAAARWATTAWAATTIAYATWATATWTAAAWWTATAWATTATTMW 802 

Qy 2321 tagtcctaatcttaatttgatgcagaggtgataataatcttaatttgatgcagaggtaat 2380 

:| I II III I I I I MM: || : I III 

Db 801 WMTWTATAWNATTTTTTTTWTAWAAWTWTTWATWAWTAATTTTAAWTWWTTAAATAAA 742 

Qy 2381 aatgggccgggtttgagctggacttaagcatgatattgacgtactttatatttttccaaa 2440 

II III : I Ml I Ml : I Mill :| M 

Db 741 AAAAATTTATTTTTATTTWTTATTWAAAWTTTTWTTTTWAATTWWYTTTAATAWTTAAAW 682 

Qy 2441 ttcaacccagctcgaaatatgagtctaaaattttgtccaatttaatccaagcccatttta 2500 

I: I II 11:111 I : III M |:| || |: :: 
Db 681 TWTTAAAWAWWT---AAWTTAAATAAATWTATTTAAAATWrWAAWTATAAATTTAWADWT 625

Qy 2501 agttcgtccatattattttttaatttaaaaaatttatatcattttattttaatatttaat 2560 

I I II:: :M: I I Mil Mill Ihllll llll:::ll 
Db 624 TATAWWTTTTTTTWWCWTTWAATATATAAAATWAAAAATTAAATTOTTTTATATWWWAT 565 

Qy 2561 tattttatatattttttatttattgaaaatttttatatagtcatcttaacattatgttaa 2620 

I ll:::|| lllll :: I 1 1 : : I I I II I I I M 1 1 I 
Db 564 AAAAAMATWWWTTATTTATAAWWAATTATTTAWAWTTGATTTTTATTTATTTWTTTTTTA 505 

Qy 2621 tgtttatattagagtagtattatatatatttagtataggtttattttgttaataaa 2676 

:|| lllll Mill 11:1 :| I I III I I II II: 
Db 504 AAWTTTTATTATATWATTTATTAATWTTANTTATYTTTTTTTTATATCTTTCMAAW 449 



RESULT 13 
AQ026918 

LOCUS AQ026918 890 bp DNA GSS 30-JUN-1998
DEFINITION CIT-HSP-2322B22.TF CIT-HSP Homo sapiens genomic clone 2322B22,
genomic survey sequence.
ACCESSION AQ026918



VERSION AQ026918.1 GI:3267140 
KEYWORDS GSS. 

SOURCE human. 
ORGANISM Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (bases 1 to 890)
AUTHORS Adams,M.D., Rounsley,S.D., Zhao,S., Field,C.E., Bass,S., Linher,K.,
Golden,R., Berry,R., Granger,D., Suh,E., Wible,C., Shizuya,H.,
Simon,M. and Venter,J.C.
TITLE Use of a random BAC End Sequence Database for Sequence-Ready Map 

Building (1998) 
JOURNAL Unpublished (1998) 
COMMENT Contact: Mark Adams 

Department of Eukaryotic Genomics 

The Institute for Genomic Research 

9712 Medical Center Dr., Rockville, MD 20850, USA 

Tel: 301 838 0200 

Fax: 301 838 0208 

Email: mdadams@tigr.org

Clones are available from Research Genetics (info@resgen.com). BAC
end search page: 

http://www.tigr.org/tdb/humgen/bac_end_search/bac_end_search.html.
Seq primer: M13-21 
Class: BAC ends. 
FEATURES Location/Qualifiers 
source 1..890
/organism="Homo sapiens"
/db_xref="taxon:9606"
/clone="2322B22"
/clone_lib="CIT-HSP"
/sex="Male"
/cell_type="Sperm"
/note="Vector: pBeloBAC11; Site_1: HindIII; Site_2: HindIII"

BASE COUNT 204 a 71 c 44 g 571 t
ORIGIN



Query Match 3.5%; Score 107.2; DB 93; Length 890;

Best Local Similarity 49.0%; Pred. No. 6.6e-07;

Matches 344; Conservative 0; Mismatches 353; Indels 5; Gaps 2; 

Qy 1964 tcaatgaatcgatttcaattttcgcagtataagttccttttaatcctttctttttacttc 2023 

iiiii mi mi i mi n inn i in m i n -• 

Db 147 TTATTAATTTTATTTTTTTTTTATTATTATATATTTTTTTTATTTTTTTTATTTAATTTA 206

Qy 2024 attttataacgaattctatggataatgttccctacaaacatgtcattacaatgtttaatt 2083 

llll I I II I II II II I I I III II III III 
Db 207 TTTTTTTTTATATTTTT----ATTATTTTATTTTTATTTTTTAAATTTATATTTTTTATT 262

Qy 2084 ataaattccattcttctattttactaagatattagtaacttcaaactgctgatttttact 2143 

III I lllll lllll I I llll lllll II llllll I 
Db 263 ATATTTATTTTTTTTATTTTTTATTTATTTATTTATTATTTTATTTTATTTATTTTTTAT 322 

Qy 2144 aatttattatttataaattgttagaatgattatttttcaataatttaacaacaatattta 2203 

I I I llllll II III II II lllll III II I I II III 
Db 323 TTTATTTATTTTATATTTTTTTA-TATTTTTTTTTTTTAATTTTTATATTATTATTTTTT 381 

Qy 2204 atattattattattattatttctcaatttttattaaacaaaaacataaatttttgacaaa 2263 

llll lllll II II II I I MINIM I I I II II I 
Db 382 ATATATTTATTTTTTTTTATTTTTATTTTTTATTATATTTTATATTTTTTTATTTATTTT 441 

Qy 2264 ttaaaataaatgaattaatttctcaatttttcgtgcaactattacaaaaatccttcatag 2323 

II II I II II I I II I I II II III II III 
Db 442 TTTTTTTATTTATTTTTTTTATTATTTATTATTTTATTATTATATAATAATATTTTATAT 501 

Qy 2324 tcctaatcttaatttgatgcagaggtgataataatcttaatttgatgcagaggtaataat 2383 
I I I lllll II III llll I III II I 

Db 502 TATTTAATATATTTTTTATTTTATTTTATATTAATTATTATTATATATTTTAATTTATTA 561 

Qy 2384 gggccgggtttgagctggacttaagcatgatattgacgtactttatatttttccaaattc 2443 

III I I il II llllll Ml I III 

Db 562 TTTTTTATTTTAATTTATTTTTTTTTTATTATTTTTTTTATTATTTTTATTTTTATATTT 621 

Qy 2444 aacccagctcgaaatatgagtctaaaattttgtccaatttaatccaagcccattttaagt 2503 

II I III I I I III I I I I II lllll I 
Db 622 TTTTTAATTTTTATTATTTATAATATAATTTTTTTATATAATTCATTATTAATTTTTATA 681 

Qy 2504 tcgtccatattattttttaatttaaaaaatttatatcattttattttaatatttaattat 2563 

i i iiminim llll II III III III I II I I II 
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Magnoliophyta; eudicotyledons; Rosidae; eurosids II; Malvales; 

Malvaceae; Gossypium. 
REFERENCE 1 (bases 1 to 921) 
AUTHORS Leslie,A., Frisch,D., Yu,Y., Wood,T.C., Wing,R.A. and Wilkins,T.A.
TITLE An integrated analysis of the genetics, development, and evolution 

of the cotton fiber 
JOURNAL Unpublished (2000) 
COMMENT Contact: Wing RA 

Clemson University Genomics Institute 

Clemson University 

100 Jordan Hall, Clemson, SC 29634, USA 
Tel: 864 656 7288 
Fax: 864 656 4293 
Email: rwing@clemson.edu
High quality sequence stop: 921. 
FEATURES Location/Qualifiers 
source 1..921
/organism="Gossypium arboreum"
/strain="AKA"
/cultivar="8400"
/db_xref="taxon:29729"
/clone="GA_Ea0023O23"
/clone_lib="Gossypium arboreum 7-10 dpa fiber library"
/tissue_type="Fibers isolated from bolls harvested 7-10
dpa"
/lab_host="E. coli"
/note="Vector: pBR-CMV; Site_1: EcoRI; Site_2: XhoI"
BASE COUNT 265 a 185 c 190 g 281 t 
ORIGIN 



Query Match 3.7%; Score 113.8; DB 80; Length 921; 

Best Local Similarity 67.6%; Pred. No. 7.2e-08; 

Matches 209; Conservative 0; Mismatches 87; Indels 13; Gaps 3; 

Qy 952 ctacttaaataatagataaattaattgtggtacattagatcaaagaacaaactagatttt 1011 

i! I ii i i mill i mi iiiiiiii!! i mi i i n 

Db 715 CTTTATTAAAATTGAGAAAATTAGTCCCTGTACGTTAGATCAAACAGCAAATTAAACATT 656 
Qy 1012 gtcccattctattgttaaaagctggtccgtttacattaaaataaggtacatgttacatgc 1071 

ii iiMii iiiiii i inn ii Minimi i mil 

Db 655 TTATTAAAAATTGGTCCTTGTACATCAACATAAGGTACACATGGCATGC 607 

Qy 1072 cacgtataactatctggttattctatcaatcacgctaatttttaacagtagaaatgaatg 1131 

III I M IIIIIIII Mil II I INI I II II Mill III 

Db 606 TACGAGTCACTATCTAGTTATTCTGTCAACCATACCGGTTTTAACCAATATAAATGGATG 547 
Qy 1132 taatttttaaatagaaagggtcaaattgttatttgatctaacacgtagggattaatttac 1191 

minim ii iii iii iii i iii iiiiii 1 1 inn mm 

Db 546 GAATTTTTAACAAG-AAGAACCAATTTGCTCTTTAATCTAATATGCAGGGACTAATTTGT 488 
Qy 1192 ttattttcctaaagaaataagtaaaatataatttgaatcttaatacaaaaactttcatga 1251 

mm i i i i i inn m m n n mm mi m 

Db 487 CTATTTT-ATGAGTATAGGGGCAAAATGCAATCTGACTCCTAGTACAAAGACTTCCATAG 429 



Qy 1252 tacttttat 1260 

III lllll 
Db 428 TACCTTTAT 420 



RESULT 10 
CNS003BD/C 

LOCUS CNS003BD 1101 bp DNA GSS 03-JUN-1999 

DEFINITION Drosophila melanogaster genome survey sequence TET3 end of BAC

BACR08K08 of RPCI-98 library from Drosophila melanogaster (fruit 

fly), genomic survey sequence. 
ACCESSION AL064091 
VERSION AL064091.1 GI:4941847
KEYWORDS GSS.
SOURCE fruit fly. 
ORGANISM Drosophila melanogaster 

Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; 



Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE 1 (bases 1 to 1101)
AUTHORS Genoscope.
TITLE Direct Submission 

JOURNAL Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage :
BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
- Web : www.genoscope.cns.fr)
COMMENT Determination of this BAC-end sequence was carried out as part of a 
collaboration with the Berkeley Drosophila Genome Project (BDGP). 
The BDGP is constructing a physical map of the Drosophila 
melanogaster genome using these BACs. For further information 
please see http://www.fruitfly.org The BDGP Drosophila 
melanogaster BAC library was prepared by Kazutoyo Osoegawa and 
Aaron Mammoser in Pieter de Jong's laboratory in the Department of 
Cancer Genetics at the Roswell Park Cancer Institute in Buffalo, 
NY. The library is named RPCI-98 and was constructed by partial 
EcoRI digestion of Drosophila DNA provided by the BDGP from the 
isogenic strain y2; cn bw sp, the same strain used for the BDGP's 
P1 and EST libraries. A more detailed description of the library
and how to order individual BAC clones, the entire library, or 
filters for hybridization from the BACPAC Resource Center can be 
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES Location/Qualifiers 
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BACR08K08"
/note="end : TET3"
BASE COUNT 395 a 120 c 103 g 334 t 149 others 
ORIGIN 



Query Match 3.6%; Score 109.6; DB 122; Length 1101; 

Best Local Similarity 40.5%; Pred. No. 2.9e-07;

Matches 247; Conservative 73; Mismatches 289; Indels 1; Gaps 1;

Qy 371 tcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacatat 430

I II MM : ! M: I : : : II : : I : III t I 
Db 1056 TKTTTTTTKWTATRWATAKAWTTTTTWGTRTWTTTRTWWTWTWNATTTWATTTTATTTTT 997

Qy 431 atatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttatatta 490

I I ::lll I I : I: I III III I hill ::MIM I; :| M I 

Db 996 TTTTTWWTATTATAATWATWATATAAAAATATTWTWTWTAATWRWTTTWTARAKAAAWAA 937 

Qy 491 tttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataaaaaa 550 

m mm i ii m mi i : mn 11 i = :: mm i 

Db 936 WTAWTTWATTTTTATWTTAATTTTTTTTTWTTAWTTTAWTTATTTWAAAWWTATWAWATA 877 
Qy 551 aattgtacacatttacaagcccatatacaaataattatataaatattcattaaaaaatat 610 

mi i iii ii m mm : mi: i m 

Db 876 WATTATTTTTATTATTTTTTTWTTTTTATTWTANAATWTTTAATKAWATTTAAWTATWAA 817 

Qy 611 atttaaatataggatataaatataactattttagaattattctactttaagataacatag 670 

1 1 ! 1 1 1 1 :l I II: : lllll II: I: III II MM: ||: 
Db 816 ATTTAAAWAAAWTATWNWNATATATAAATWATWATATTTTTAATATAAATAAAAWTATWT 757 

Qy 671 gttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaaca 730 

II Mil II: II ill I II :|: IIIIIIII 

Db 756 TTTTTTAATATATTTWTAAAWAWTATATATTTAWAWTTTATAATTTTTTTTTTATATTTT 697 

Qy 731 taatcactaaccatttttat-taacttcttggttttgaagttccaaaaagaaaatggaag 789 

I I I llll Ml:: : | || HI Ml I 

Db 696 TTTTTTTTTTTTTTTTTAATWTTTNWTAAATAAAAWTTATTTAAAAATTAAWAAANATAW 637 

Qy 790 ggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttcttctt 849 

MM I I II I MMI III llll II II 
Db 636 AAAAAWTAAAAAAAAAAAAAAAAAAAAAAAAATKWTTTTTTTTTTTTTTTTTTTTTTTTT 577 

Qy 850 tttaatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaatataa 909 

III I lllllll I II |:|::|| III:! II M I M II 
Db 576 TTTTTTTTTTTTATKTATAAAAKTWWAAAAAAWTTTTTTTTTTTWTTKTTTTTTWTTTAT 517 
please see http://www.fruitfly.org The BDGP Drosophila
melanogaster BAC library was prepared by Kazutoyo Osoegawa and
Aaron Mammoser in Pieter de Jong's laboratory in the Department of
Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
NY. The library is named RPCI-98 and was constructed by partial
EcoRI digestion of Drosophila DNA provided by the BDGP from the
isogenic strain y2; cn bw sp, the same strain used for the BDGP's
P1 and EST libraries. A more detailed description of the library
and how to order individual BAC clones, the entire library, or
filters for hybridization from the BACPAC Resource Center can be
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES             Location/Qualifiers
     source          1..1101
                     /organism="Drosophila melanogaster"
                     /db_xref="taxon:7227"
                     /clone_lib="RPCI-98"
                     /clone="BACR08K08"
                     /note="end : TET3"
BASE COUNT 395 a 120 c 103 g 334 t 149 others 



Query Match 3.9%; Score 118.4; DB 122; Length 1101; 

Best Local Similarity 42.2%; Pred. No. 1.5e-08; 

Matches 234; Conservative 61; Mismatches 260; Indels 0; Gaps 

Qy 383 aactttgataagtcaccaaactttaacaaagtttgattgtgtacatatatatatatatct 442 

I llll Mil : I II: : II III II I :l I : I 
Db 510 ACCTTTMTAMWAAAAAAMAAWAAAAAAAAAAAWTTTTTTWWAMTTTTATAMATAAAAA 569 

Qy 443 tcaaattttataataaaaattgtgtttaaataatttacagttatattatttttttatctc 502 

Db 570 AAAAAAAAaIa^ 629 

Qy 503 taattttatttgtcgccaaatttttagttgatattttaacataaaaaaaattgtacacat 562 

: MM I I HUH | |: III |: II: I I I 

Db 630 AWTTTTTMTATNTTTWTTAATTTTTAAATAAWTTTTATTTAWNAAAWATTAAAAAAAAAA 689 

Qy 563 ttacaagcccatatacaaataattatataaatattcattaaaaaatatatttaaatatag 622 

I II Mil III II llllll: Mill MM I MIMIM 
Db 690 AAAAAAAAAAATATAAAAAAAAAATTATAAAWTWTAAAIATATAWTWTTTAWAAATATAT 749 

Qy 623 gatataaatataactattttagaattattctactttaagataacataggttaaatgtata 682 

I I 1 1 1 : M I : I I II llll |: : | | HI : :| I h 
Db 750 TAAAAAAAWATAWTTTTATTTATATTAAAAATATWATWATTTATATATNWNWATAWTTTW 809 

Qy 683 attaataaggttagtttattgtaaagatgagtatatatgtcgtaaacataatcactaacc 742 

llll II II I I hi I : I : I II: I II I 
Db 810 TTTAAATTTWATAWTTAAATWTMATTAAAWATTNTAWAATAAAAAWAAAAAAATAATAAA 869 

Qy 743 atttttattaacttcttggttttgaagttccaaaaagaaaatggaagggaaatttgagag 802

I I Ml : : : II: I : II: I Ml II II: I II 
Db 870 AATMTWTATWTWATAWWTTTWAMTAAWTAAAWTMAAAAAAAAATTAAWATAAAAAT 929

Qy 803 taagttcatgtttatattatacataatgaagttgatgttttcttctttttaatattttta 862

Ml I I MM I : I ::MI |: : I llll |: Ml I II 
Db 930 WAAWTAWTTWTTTMTYTAWAAAWYWATTAWAWAWAATATTTTTATATWATWATTATAATA 989

Qy 863 tacaaaatatttaaataaaataattaaggattgaatgaaaaatataatgaaagtcgtttt 922

:: llll I MIMIM::: ||: I M I IM I : : 
Db 990 WWAAAAAAAAAATAAMTWAAATNWAWAWWAYAMWAYACWAAAAAWTMTATWYATAWMA 1049



Qy 923 actaatagtcatatt 937

I II I : I III 
Db 1050 AAAAAMAAHTANATT 1064



RESULT 7
CNS003DQ/C

LOCUS       CNS003DQ     1101 bp    DNA             GSS       03-JUN-1999
DEFINITION  Drosophila melanogaster genome survey sequence TET3 end of BAC #
            BACR08I09 of RPCI-98 library from Drosophila melanogaster (fruit
            fly), genomic survey sequence.
ACCESSION   AL064580
VERSION     AL064580.1  GI:4941932
KEYWORDS    GSS.
SOURCE      fruit fly.
  ORGANISM  Drosophila melanogaster
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1  (bases 1 to 1101)
  AUTHORS   Genoscope.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
            - Web : www.genoscope.cns.fr)
COMMENT     Determination of this BAC-end sequence was carried out as part of a
            collaboration with the Berkeley Drosophila Genome Project (BDGP).
            The BDGP is constructing a physical map of the Drosophila
            melanogaster genome using these BACs. For further information
            please see http://www.fruitfly.org The BDGP Drosophila
            melanogaster BAC library was prepared by Kazutoyo Osoegawa and
            Aaron Mammoser in Pieter de Jong's laboratory in the Department of
            Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
            NY. The library is named RPCI-98 and was constructed by partial
            EcoRI digestion of Drosophila DNA provided by the BDGP from the
            isogenic strain y2; cn bw sp, the same strain used for the BDGP's
            P1 and EST libraries. A more detailed description of the library
            and how to order individual BAC clones, the entire library, or
            filters for hybridization from the BACPAC Resource Center can be
            found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES             Location/Qualifiers
     source          1..1101
                     /organism="Drosophila melanogaster"
                     /db_xref="taxon:7227"
                     /clone_lib="RPCI-98"
                     /clone="BACR08I09"
                     /note="end : TET3"
BASE COUNT      291 a     51 c    117 g    404 t    238 others
ORIGIN



Query Match 3.8%; Score 115.8; DB 122; Length 1101;

Best Local Similarity 44.2%; Pred. No. 3.6e-08; 

Matches 253; Conservative 45; Mismatches 272; Indels 2; Gaps 

Qy 402 actttaacaaagtttgattgtgtacatatatatatatatcttcaaattttataataaaaa 461 

I llll I:: M MM M I ! 1 1 : II: :| II hll II 

Db 574 ATTTTACTTTTTTYYYCYTCNYWAYAAAWAAATATWAATMACWTWAAATTTTWATTTAAT 515 

Qy 462 ttgtgtttaaataatttacagttatattatttttttatctctaattttatttgtcgccaa 521 

II I I I I III III: llllll II I : llllll I I - 

Db 514 TTAAATATTTAAATTTTTATTTTTATTCWATTTTTTTTCCCCWTTTTTATATTTTAATWW 455 

Qy 522 atttttagttgatattttaacataaaaaaaattgtacacatttacaagcccatatacaaa 581 

I II M I: I Ml :|: I : II II: : I I : I I llll III 

Db 454 AATTAWATTWAAAATTAWAWWAAWTAATAAWAAAWTAAAAAWAAAATAAAATTATAAAAA 395 

Qy 582 taattatataaatattcattaaaaaatatatttaaatataggatataaatataactattt 641 

II II II lllll II :■ : I I I II M llllllllll I I: I 

Db 394 TATTTTTAAAAATA--AATATWTTWTTTTTATATAAAAWAATATATAAATATTAWAAWAT 337 

Qy 642 tagaattattctactttaagataacataggttaaatgtataattaataaggttagtttat 701 

I: 1:1 I I I II I : : II M llll MM: :|| : M 
Db 336 TWWAWTATATTTTTTATATATTTWTWATTTTTTTWTTTATATTTAWTWTTWWTATAWAWT 277 

Qy 702 tgtaaagatgagtatatatgtcgtaaacataatcactaaccatttttattaacttcttgg 761 

: I II I I II I : II I II Mill llll II 
Db 276 WATTAAATTAATTACAAWAAAAAATAAAAAAAAAAAAAAATATAAAAATAAAATTTAAAA 217 

Qy 762 ttttgaagttccaaaaagaaaatggaagggaaatttgagagtaagttcatgtttatatta 821 

III II I lllll llll II I II I I II I Ml 

Db 216 TTTAAAAAATAAAAAAAAAAAAAAAAAAAMCATAAAAAAAAAAAAAAAAAAAAAAAAAA 157 

Qy 822 tacataatgaagttgatgttttcttctttttaatatttttatacaaaatatttaaataaa 881 
Aaron Mammoser in Pieter de Jong's laboratory in the Department of
Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
NY. The library is named RPCI-98 and was constructed by partial
EcoRI digestion of Drosophila DNA provided by the BDGP from the
isogenic strain y2; cn bw sp, the same strain used for the BDGP's
P1 and EST libraries. A more detailed description of the library
and how to order individual BAC clones, the entire library, or
filters for hybridization from the BACPAC Resource Center can be
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES             Location/Qualifiers
     source          1..1101
                     /organism="Drosophila melanogaster"
                     /db_xref="taxon:7227"
                     /clone_lib="RPCI-98"
                     /clone="BACR29B23"
                     /note="end : T7"
BASE COUNT      419 a     91 c     60 g    299 t    232 others
ORIGIN



Query Match 4.2%; Score 127.6; DB 122; Length 1101; 

Best Local Similarity 36.5%; Pred. No. 7.1e-10;

Matches 241; Conservative 133; Mismatches 285; Indels 2; Gaps 

Qy 312 cacatatcacattttgagtttgtatgatgataagtcgacataancgaaatatggtgtgat 371 

I::: : :|:||:| :|: : : : I I I 1 : 1 1 I -:| h 
Db 442 CMMMMMMHAMATYTCTCAHTWTMMMMMMWWAATWTWWAAAWAAAWTTATWAATWAAAAAW 501 

Qy 372 cttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacatata 431 

:: : Mil :: ::| 1 1 : I :' : : :||| III : II ::| 
Db 502 AWWWWATTTTTWWWTOWATTWTTTWAWWTWTAWTAAAAAAAAAWATAATTTAAAWWAAT 561 

Qy 432 tatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttatattat 491 

: II : II hi ::|lll III : I hhl Mil II I Mill I 
Db 562 AWATTAAWAATTTAWAAWWTATATTAATWTATAAATWrWATTAATATAAAAAAATATTTT 621 

Qy 492 ttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataaaaaaa 551 

II: I I I I I MM III : : I::: I llll II 

Db 622 TTWATAAAATTTTTAATAATTTAATTAWTTATTAAATAAWTMATTWWWTAATTAAATAAT 681 

Qy 552 attgtacacatttacaagcccatatacaaataattatataaatattcattaaaaaatata 611 

I: I I I I II I 1:1 :lhllhllhl I HI I -\\ II hi 
Db 682 TTWAMTAWAAAAAAAAAAAAAAAWATWAAWAATWATAWATAAWTTAAAAWAATAAAAWA 741 

Qy 612 tttaaatataggatataaatataactattttagaattattctactttaagataacatagg 671 

: I : ::| llll::|||| I :||:|| :: |:: :: I :l hll hi 
Db 742 AWAATWAWWATAATATMWATAT-ATAWTTWTAWWWATWWAWWWTATAWATAWAATAWAAW 800 

Qy 672 ttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaacat 731 

:hll Ml: lh : :| : : :| II : 1 1 1 : 1 : I : I 
Db 801 AWAWATAAATAWATAWATWAAAWAWAWATAWWATWATATAWWAATAWAWAAAAAATWTAA 860 

Qy 732 aatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaaggg 791 

II I hi I: I lh :l llll :l : llh :| h h 
Db 861 TATWMTWATAAWAAAAMTAWAWTTWWTWTTTTWWAWWATATAAAWAWATAMAAAWAAA 920 

Qy 792 aaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttcttctttt 851 

III I I h : I I : I II h :l : I : h: h I : 
Db 921 AAAAAAAAAATAAWAWWWTWATATTWTTATTAAAKTWTATWWATTWATTWWAWTWTATAW 980 

Qy 852 taatatttttatacaa-aatatttaaataaaataattaaggattgaatgaaaaatataat 910 

I :|lhl llll :: hlh II II I llh: I II h h llllll : 
Db 981 TTWTATWTATATATWWTAWTAWATATATTTATTAAWWTATATTTTAWAAAWTAATATATM 1040 

Qy 911 gaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataatagataa 970 

I: : : I :|::| I |: I hi h I :|| :|:| |::| 
Db 1041 ATAWWWTAWATATAWAWWAATTAWTTATATATWTAAWATAWWAAAAAWAAAWAWATAWWA 1100 

Qy 971 a 971 

I 

Db 1101 A 1101 



RESULT 4
B12681

LOCUS       B12681        804 bp    DNA             GSS       14-MAY-1997
DEFINITION  F27D1-Sp6.1 IGF Arabidopsis thaliana genomic clone F27D1,
            genomic survey sequence.
ACCESSION   B12681
VERSION     B12681.1  GI:2093801
KEYWORDS    GSS.
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
            Magnoliophyta; eudicotyledons; Rosidae; eurosids II; Brassicales;
            Brassicaceae; Arabidopsis.
REFERENCE   1  (bases 1 to 804)
  AUTHORS   Feng,J., Dewar,K., Buehler,E., Kim,C., Li,Y., Shinn,P., Sun,H. and
            Ecker,J.
  TITLE     BAC End Sequences at ATGC
  JOURNAL   Unpublished (1997)
COMMENT     On Dec 15, 1999 this sequence version replaced gi:4123046.
            Other_GSSs: F27D1-Sp6.2, F27D1-Sp6
            Contact: Ecker J.
            Arabidopsis Thaliana Genome Center
            University of Pennsylvania
            Dept. of Biology, University of Pennsylvania, Philadelphia, PA
            19104
            Tel: 215-898-9384
            Fax: 215-898-8780
            Email: jecker@atgenome.bio.upenn.edu
            Seq primer: Sp6
            Class: BAC ends
            High quality sequence start: 335
            High quality sequence stop: 346.
FEATURES             Location/Qualifiers
     source          1..804
                     /organism="Arabidopsis thaliana"
                     /strain="Columbia"
                     /db_xref="taxon:3702"
                     /clone="F27D1"
                     /clone_lib="IGF"
                     /sex="hermaphrodite"
                     /note="Vector: BeloBACII; Site_1: EcoRI; Site_2: EcoRI;
                     Produced by Thomas Altmann"
BASE COUNT      241 a     14 c     20 g    439 t     90 others
ORIGIN



Query Match 3.9%; Score 120; DB 120; Length 804;

Best Local Similarity 49.8%; Pred. No. 9.3e-09;

Matches 314; Conservative 3; Mismatches 311; Indels 6;



Qy 2046 taatgttccctacaaacatgtcattacaatgtttaattataaattccattcttctatttt 2105 

b 172 ' ' iAAAATTTNNTTTTTTTTm 231 

Qy 2106 actaagatattagtaacttcaaactgctgatttttactaatttattatttataaattgtt 2165 

I II II III I II III I II III I II III III I I II 

Db 232 AATATAATTTTAATTTTTTAAAAAATTTTTTTATTATTTTTTNATTNTTTTTTTNATTTT 291 

Qy 2166 agaatgattatttttcaataatttaacaacaatatttaatattattattattattatttc 2225 

I II II II II Mill I I II llllll II I I llll 

Db 292 TNNTTTTTATTAAAATTTTTAAATNTATATTAAAAAATTATTTTTTTAAAAATTTT 347 

Qy 2226 tcaatttttattaaacaaaaacataaatttttgacaaattaaaataaatgaattaatttc 2285 

III II III llll I I llll I I I III II I Hill 

Db 348 TAATTNAAAATAAAAAAAAAAATTTATTTTTATTAATAATTAAAAAANTTNTATAATTAT 407 

Qy 2286 tcaatttttcgtgcaactattacaaaaatccttcatagtcctaatcttaatttgatgcag 2345 

i inn i n n ii i ii 1 1 1 i i ii i i 

Db 408 TTTTTTTTTTTTTTAAANNNTATTAANAAAATTTANAATTTTTTNAATTTTTNNAAATAA 467 

Qy 2346 aggtgataataatcttaatttgatgcagaggtaataatgggccgggtttgagctggactt 2405 

I I I I I HUH I I I I I III I 
gb_gss13
gb_gss14
gb_gss15
gb_gss16
gb_gss17
gb_gss18
gb_gss19



SUMMARIES

             %
           Query
  Score    Match  Length   DB  ID          Description

  133.6      4.4    1101  122  CNS00EVL    AL069706 Drosophil
  129.6      4.3    1101  122  CNS0021J    AL061936 Drosophil
  127.6      4.2    1101  122  CNS00EVL    AL069706 Drosophil
  120        3.9     804  120  B12681      B12681 F27D1-Sp6.1
  119.4      3.9    1101  122  CNS0039G    AL063921 Drosophil
  118.4      3.9    1101  122  CNS003BD    AL064091 Drosophil
  115.8      3.8    1101  122  CNS003DQ    AL064580 Drosophil
  115.2      3.8    1101  122  CNS00EO7    AL069440 Drosophil
  113.8      3.7     921   80  AW727018    AW727018 GA_Ea002
  109.6      3.6    1101  122  CNS003BD    AL064091 Drosophil
  108        3.5    1187  120  B11102      B11102 F19C22-T7 I
  107.4      3.5    1101  122  CNS00EO7    AL069440 Drosophil
  107.2      3.5     890   93  AQ026918    AQ026918 CIT-HSP-2
  105.6      3.5    1101  122  CNS00BO1    AL057419 Drosophil
  105        3.4    1101  122  CNS00EPO    AL069493 Drosophil
  104.8      3.4    1101  122  CNS00LT2    AL078714 Drosophil
  102.6      3.4     893  122  CNS013XE    AL103436 Drosophil
  102.6      3.4    1101  122  CNS0039G    AL063921 Drosophil
  101.4      3.           122
  101        3.           120  B11102      B11102 F19C22-T7 I
  100.8      3.           117  AQ897537    AQ897537 HSJ153J
  100.6      3.           122  CNS00FYG    AL071206 Drosophil
  100.2      3.           123  CNS0161D    AL106171 Drosophil
  100        3.           122  CNS0021J    AL061936 Drosophil
   99.8      3.           123  CNS0167M    AL106396 Drosophil
   99.2      3.           122  CNS00DKY    AL071865 Drosophil
   98.4      3.            79  AW683426    AW683426 NFQ11G12L
   98        3.           122  CNS00ITT    AL075432 Drosophil
   97.8      3.2     804  120  B12681      B12681 F27D1-Sp6.1
   97.4      3.2     836  122  CNS01100    AL099642 Drosophil
   97.4      3.2     836  122  CNS01100    AL099642 Drosophil
   97.4      3.2     935  120  B10881      B10881 F24H6-Sp6.1
   96.6      3.2    1101  122  CNS00EQL    AL069526 Drosophil
   95.6      3.1     990  122  CNS006OI    AL065624 Drosophil
   95.6      3.1    1101  122  CNS003BB    AL064089 Drosophil
   95.2      3.1     963  122  CNS00A4L    AL054918 Drosophil
   95.2      3.1    1225  123  CNS0161D    AL106171 Drosophil
   95        3.1     893  122  CNS013XE    AL103436 Drosophil
   95        3.1    1101  122  CNS003B4    AL064082 Drosophil
   94.8      3.1    1248  120  B11336      B11336 F19M10-Sp6
   93.4      3.1    1006  122  CNS008I3    AL051903 Drosophil
   93.2      3.1     828  113  AQ739398    AQ739398 HS_5482_B
   93        3.1     966  122  CNS005ZC    AL061991 Drosophil
   93        3.1    1101  123  CNS017KE    AL108152 Drosophil
   92.8      3.0     828  113  AQ739398    AQ739398 HS_5482_B


RESULT 1
CNS00EVL/C

LOCUS       CNS00EVL     1101 bp    DNA             GSS       04-JUN-1999
DEFINITION  Drosophila melanogaster genome survey sequence T7 end of BAC #
            BACR29B23 of RPCI-98 library from Drosophila melanogaster (fruit
            fly), genomic survey sequence.
ACCESSION   AL069706
VERSION     AL069706.1  GI:4949849
KEYWORDS    GSS.
SOURCE      fruit fly.
  ORGANISM  Drosophila melanogaster
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1  (bases 1 to 1101)
  AUTHORS   Genoscope.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage :
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
            - Web : www.genoscope.cns.fr)
COMMENT     Determination of this BAC-end sequence was carried out as part of a
            collaboration with the Berkeley Drosophila Genome Project (BDGP).
            The BDGP is constructing a physical map of the Drosophila
            melanogaster genome using these BACs. For further information
            please see http://www.fruitfly.org The BDGP Drosophila
            melanogaster BAC library was prepared by Kazutoyo Osoegawa and
            Aaron Mammoser in Pieter de Jong's laboratory in the Department of
            Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
            NY. The library is named RPCI-98 and was constructed by partial
            EcoRI digestion of Drosophila DNA provided by the BDGP from the
            isogenic strain y2; cn bw sp, the same strain used for the BDGP's
            P1 and EST libraries. A more detailed description of the library
            and how to order individual BAC clones, the entire library, or
            filters for hybridization from the BACPAC Resource Center can be
            found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES             Location/Qualifiers
     source          1..1101
                     /organism="Drosophila melanogaster"
                     /db_xref="taxon:7227"
                     /clone_lib="RPCI-98"
                     /clone="BACR29B23"
                     /note="end : T7"
BASE COUNT      419 a     91 c     60 g    299 t    232 others
ORIGIN



Query Match 4.4%; Score 133.6; DB 122; Length 1101;

Best Local Similarity 41.5%; Pred. No. 9.5e-11;

Matches     ; Conservative 100; Mismatches 219; Indels 4; Gaps

Qy 2135
Db 1068

Qy 2195
Db 1008

Qy 2255
Db 948

Qy 2314
Db 888

Qy 2374
Db 828

Qy 2434
Db 771

Qy 2494
Db 711

OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/883,795A
FILING DATE: 27-JUN-1997
CLASSIFICATION: 435
ATTORNEY/AGENT INFORMATION:
NAME: Gravelle, Micheline
REGISTRATION NUMBER: 40,261
REFERENCE/DOCKET NUMBER: 7841-062
TELECOMMUNICATION INFORMATION:
TELEPHONE: (416) 364-7311
TELEFAX: (416) 361-1398
INFORMATION FOR SEQ ID NO: 36:
SEQUENCE CHARACTERISTICS:
LENGTH: 665 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
ORIGINAL SOURCE:
ORGANISM: Homo sapiens
IMMEDIATE SOURCE:
CLONE: Rh 32
US-08-883-795A-36




Query Match 2.5%; Score 75.4; DB 4; Length 665;

Best Local Similarity 47.1%; Pred. No. 2.6e-05;

Matches 296; Conservative 0; Mismatches 331; Indels



Qy 370 atcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacata 429 

II II I II Mil I II II I III I I I I 

Db 648 ATTTTTTCCTTCTCACTTGGCAAATACAATTCCTGAGATCAATAACCTCGTCTTTTTAAT 589 

Qy 430 tatatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttatatt 489 

I I I llll I I I lllll! I III I III I I Mill 

Db 588 TTTTTCCTCGTCTTTTTAACTATTTATAAAATATTGAATTATAAAATATGTAATTATA-A 530 

Qy 490 atttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataaaaa 549 

II III I! lllll I I I III I I I I I I lllll 

Db 529 ATACTTTAATTATAAAATATGTAATTATAAATACTTTAATTATAAAATATGTAATTATAA 470 

Qy 550 aaattgtacacatttacaagcccatatacaaataattatataaatattcattaaaaaata 609 

I I I II I I I I III llllllll I II II llll llllll 

Db 469 ATACTTTATAAMTATGTAATTATAAAATATGTAATTATAAACATTTTAATTATAAAATA 410 

Qy 610 tatttaaatataggatataaatataactattttagaattattctactttaagataacata 669 

II II! I I II! llll! I II II I II I llll III 

Db 409 TGTAATTATAAACATTTTAATTATAAAATATGTAATTATAAACATTTTAATTATAAAATA 350 

Qy 670 ggttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaac 729 

I III I II I I III II III I I I II I III 

Db 349 TG-TAATTATAAACATTTTAATTATAAAATATGTAATTATAAACATTTTAATTATAAAAT 291 

Qy 730 ataatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaag 789 

II I lllll II I II I I I II II 

Db 290 ATTTAATTATAAACATTTTAATTATAAAATATTTAATTATAAATATTTTAATTATAAAAT 231 

Qy 790 ggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttcttctt 849 

I II I I! II I! I I I! I II II II I I II I 

Db 230 ATTTAATTATAAATATTTTAATTATAAAATATTTAATTATAAATATTTTAATTATAAAAT 171 

Qy 850 tttaatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaatataa 909 

II I I I III II III! I lllll! II lllll 
Db 170 ATTTAATTATAAATATTTTAATTATAAAATATTTAATTATAAATATTTTAATTATAAAAT 111 

Qy 910 tgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataatagata 969 

III I II I III llll I I II II III III III 
Db 110 ATTTAATTATAAATATTTTAATTATAAAATATTTAATTATAAATATTTTAATTATAAATA 51 

Qy 970 aattaattgtggtacattagatcaaagaa 998 

llllll I I III II I I II 



Db 50 TTTTAATTATAAAATATTTAATTATAAAA 22 



RESULT 14
US-07-867-106-2/C

Sequence 2, Application US/07867106
Patent No. 5389526
GENERAL INFORMATION:
APPLICANT: Slade, Martin B
APPLICANT: Chang, Andy C M
APPLICANT: Williams, Keith L
TITLE OF INVENTION: Improved Plasmid Vectors for Cellular
TITLE OF INVENTION: Slime Moulds of the Genus Dictyostelium
NUMBER OF SEQUENCES: 19
CORRESPONDENCE ADDRESS:
ADDRESSEE: Woodcock Washburn Kurtz Mackiewicz & Norris
STREET: One Liberty Place 46th Floor
CITY: Philadelphia
STATE: PA
COUNTRY: USA
ZIP: 19103
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/07/867,106
FILING DATE: 19920625
PRIOR APPLICATION DATA:
APPLICATION NUMBER: AU PJ 7187
APPLICATION NUMBER: PCT/AU90/00530
FILING DATE: 02-NOV-1989
ATTORNEY/AGENT INFORMATION:
NAME: Feeney, Joanne Longo
REGISTRATION NUMBER: 35,134
REFERENCE/DOCKET NUMBER: RICE-0002
TELECOMMUNICATION INFORMATION:
TELEPHONE: 215-568-3100
TELEFAX: 215-568-3439
INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS:
LENGTH: 5852 base pairs
TYPE: NUCLEIC ACID
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: DNA (genomic)
ANTI-SENSE: NO
FEATURE:
NAME/KEY: CDS
LOCATION: 2378..5038
FEATURE:
NAME/KEY: CDS
LOCATION: 2378..5038
US-07-867-106-2



Query Match 2.4%; Score 74.4; DB 1; Length 5852; 

Best Local Similarity 48.3%; Pred. No. 4.9e-05;

Matches 277; Conservative 0; Mismatches 286; Indels 11; Gaps 

Qy 406 taacaaagtttgattgtgtacatatatatatatatcttcaaattttataataaaaattgt 465 

III llll I II I II lllll llll III llll I I I 
Db 5821 TAATAAAGATCTTTTAAATTTATTAATATACATTTTTAATGGTTATTTAATTTATTTAAT 5762 

Qy 466 gtttaaataatttacagttatattatttttttatctctaattttatttgtcgccaaattt 525 

Db 5761 TATTTGTTATTTGTATTTTtJta™ 5702 

Qy 526 ttagttgatattttaacataaaaaaaattgtacacatttacaagcccatatacaaataat 585 

II II lllll I I II II I llll I I I III II 
Db 5701 CTATTTTTTATTTATAAAATTAATAATTAAATTTTAAATAAATAAAAAAAAAAAAAAAAA 5642 
Qy 2183 ataatttaacaacaatatttaatattattattattattatttctcaatttttattaaaca 2242

III I I I I I III II I I II I II II II I I I I I I 
Db 1505 ATATTCAATTATATTTGATAAATTTTTTAAATAATTATAATTTTCTTTATATGATTGATA 1446 

Qy 2243 aaaacataaatttttgacaaattaaaataaatgaattaatttctcaatttttcgtgcaac 2302 

I IIIIIMI I I I II II I II IN I I III I III 

Db 1445 ATCACATAAATGAGTTATACTTTTTGGAAATTTTATCAAATGTATATTATTTTTTTTAAC 1386 

Qy 2303 tattacaaaaatccttcatagtcctaatcttaatttgatgcagaggtgataataatctta 2362 

I I II II I II I III I I I I I I mum 

Db 1385 ATTGAAAGATATATCTAATTTTTTTAAATTAATTAATTTTTCTATAAATTTATAATCTTT 1326 

Qy 2363 atttgatgcagaggtaataatgggccgggtttgagctggacttaagcatgatattgacgt 2422 

II I II III I I II I II III 

Db 1325 TATTTGTATTGATTCCATATTTAACTCAACTATACTAATAGGAAAAACATTATTAAAGTT 1266 

Qy 2423 actttatatttttccaaattcaacccagctcgaaatatgagtctaaaattttgtccaatt 2482 

II II I III I II I I I I II II II I Mil 

Db 1265 ACCAAATTTATTTTTAGATATTATTAATTTTTTTAAATTTACTAAATTATTAATAAAATT 1206 

Qy 2483 taatccaagcccattttaagttcgtccatattattttttaatttaaaaaatttatatcat 2542 

I II III I I I I II I I II II II I III II 

Db 1205 ATAGTCATTTATATTACATGATTCACAATTTAAAAATTCTATAGAATGTGGTAGTATAAT 1146 
Qy 2543 tttattttaatatttaattattttatatattttttatttattgaaaatttttatatagtc 2602 

iii i iii i mm iii m 1 1 ii ii iii i i 

Db 1145 ATTACT----TATATTGCTATTTTTGTTATAAGATATATCTAAATATGTTATATTTTTTA 1090

Qy 2603 atcttaacattatgttaatgtttatattagagtagtattatatatatttagtataggttt 2662 

II II III III II III II I I I I llllll III II 
Db 1089 ATTTTGTTATAAAATTTAAATTAATAATATTTAAATTTGAAATATATAAACTTTTAATAT 1030

Qy 2663 attttgttaataaacttaaaaatgggtcttgtgggctagacttggaccttaaatgctcaa 2722 

II II Mil II lllll I II I I I I II II 

Db 1029 TTTCTG-GAATATTATTTAAAATATTATTATCATAATATATTATATGCAATTCTTCTAAA 971 

Qy 2723 actcaaacttaattcatattttaaacaggcttaatatttttatttacactgtttcaaatt 2782 

I II II I I I II I III III I I I II II 
Db 970 TTAACTAATTTTTTTAATATATTAATATTAATAACATTATCTCTGTTTATTATTATTTTT 911 

Qy 2783 tttcgggtgaaatatcttcgagtctagattaataacaccacaggtctaatttgatgctca 2842 

' Ill I lllll II I I I lllll II III I I I II I 
Db 910 TTTAAATTATAATATTTTAAAATATTTATTAAAATTATATCAGAATTTAGTAAATCCATT 851 

Qy 2843 atgaaaatgaaatcatattgagcttaattaatattccatt 2882 

III III II I II II llllll II II 

Db 850 TTGATAATTTTATTTTTTTTTCATTGATTAATTTTTTTTT 811 



RESULT 11
US-08-544-332-8/C

Sequence 8, Application US/08544332
Patent No. 5935777
GENERAL INFORMATION:
APPLICANT: Moyer, Richard W.
APPLICANT: Hall, Richard L.
APPLICANT: Gruidl, Michael E.
TITLE OF INVENTION: Novel Entomopoxvirus Expression System
NUMBER OF SEQUENCES: 77
CORRESPONDENCE ADDRESS:
ADDRESSEE: Gerard H. Bencen
STREET: 2421 N.W. 41st Street, Suite A-1
CITY: Gainesville
STATE: FL
COUNTRY: USA
ZIP: 32606
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/544,332
FILING DATE:
CLASSIFICATION: 435
PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 07/991,867
FILING DATE: 07-DEC-1992
PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 08/107,755
FILING DATE: 19-AUG-1993
PRIOR APPLICATION DATA:
APPLICATION NUMBER: WO 92/14818
FILING DATE: 12-FEB-1992
PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 07/827,685
FILING DATE: 30-JAN-1992
PRIOR APPLICATION DATA:
APPLICATION NUMBER: US 07/657,584
FILING DATE: 19-FEB-1991
ATTORNEY/AGENT INFORMATION:
NAME: Bencen, Gerard H.
REGISTRATION NUMBER: 35,746
REFERENCE/DOCKET NUMBER: UF114.C4
TELECOMMUNICATION INFORMATION:
TELEPHONE: 904-375-8100
TELEFAX: 904-372-5800
INFORMATION FOR SEQ ID NO: 8:
SEQUENCE CHARACTERISTICS:
LENGTH: 1511 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: unknown
MOLECULE TYPE: DNA (genomic)
ORIGINAL SOURCE:
ORGANISM: Amsacta moorei entomopoxvirus
FEATURE:
NAME/KEY: CDS
LOCATION: complement(18..218)
FEATURE:
NAME/KEY: CDS
LOCATION: complement(234..782)
FEATURE:
NAME/KEY: CDS
LOCATION: 852..1511
US-08-544-332-8




Query Match 2.5%; Score 76.4; DB 4; Length 1511; 

Best Local Similarity 46.3%; Pred. No. 1.9e-05;

Matches 324; Conservative 0; Mismatches 371; Indels 5; Gaps 

Qy 2183 ataatttaacaacaatatttaatattattattattattatttctcaatttttattaaaca 2242 

III I I I I I III II I I II I II INI I I I I I I 
Db 1505 ATATTCAATTATATTTGATAAATTTTTTAAATAATTATAATTTTCTTTATATGATTGATA 1446 

Qy 2243 aaaacataaatttttgacaaattaaaataaatgaattaatttctcaatttttcgtgcaac 2302 

I IIIIIMI I I I II II I II II I I I III I III 

Db 1445 ATCACATAAATGAGTTATACTTTTTGGAAATTTTATCAAATGTATATTATTTTTTTTAAC 1386 

Qy 2303 tattacaaaaatccttcatagtcctaatcttaatttgatgcagaggtgataataatctta 2362 

I I I I II I II I III I I I I I I MIIIMI 

Db 1385 ATTGAAAGATATATCTAATTTTTTTAAATTAATTAATTTTTCTATAAATTTATAATCTTT 1326 

Qy 2363 atttgatgcagaggtaataatgggccgggtttgagctggacttaagcatgatattgacgt 2422 

II I II UN I II I II III 

Db 1325 TATTTGTATTGATTCCATATTTAACTCAACTATACTAATAGGAAAAACATTATTAAAGTT 1266 

Qy 2423 actttatatttttccaaattcaacccagctcgaaatatgagtctaaaattttgtccaatt 2482 

II II I III I II I I I I II II II I llll 

Db 1265 ACCAAATTTATTTTTAGATATTATTAATTTTTTTAAATTTACTAAATTATTAATAAAATT 1206 

Qy 2483 taatccaagcccattttaagttcgtccatattattttttaatttaaaaaatttatatcat 2542 

III III II I I II I I II II II I III II 

Db 1205 ATAGTCATTTATATTACATGATTCACAATTTAAAAATTCTATAGAATGTGGTAGTATAAT 1146 
Qy 2357 atcttaatttgatgcagaggtaataatgggccgggtttgagctggacttaagcatgatat 2416

I II III I I II II I I II M II 

Db 1993 TTTTTTTTTTTTTTACTTTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCTCATTA 2052 

Qy 2417 tgacgtactttatatttttccaaattcaacccagctcgaaatatgagtctaaaattttgt 2476 

II I III III I llll I II II I 

Db 2053 TAAATATTAATTACTTTGGTTTTTTTTGATTTTTTTTTTAATAAATTTAAAATTTTATTC 2112 

Qy 2477 ccaatttaatccaagcccattttaagttcgtccatattattttttaatttaaaaaattta 2536 

I II llll I I llll I I I III III I I II I I III 
Db 2113 TCTATCTAATTATACCTTATTTATAAATATTGGATAATATATCAAATATTTATCAGTTT- 2171 

Qy 2537 tatcattttattttaatatttaattattttatatattttttatttattgaaaatttttat 2596 

Db 2172 TGGCATGACAATTTTAATTAWTTTATTTTTTGAT 2231 

Qy 2597 atagtcatcttaacattatgttaatgtttatattagagtagtattatatatatttagtat 2656 

I I I I II II I III I II II II I I llll I I I 

Db 2232 AAATTTCTTTTTTTTTTTTTTTATTTTTAATTTTTAATTTTTATTTTTCCCACACTTTCA 2291 

Qy 2657 aggtttattttgttaataaacttaaa 2682 

II II I III I I llll 
Db 2292 TTTTATTTTATTTTATTTATTGTAAA 2317 



RESULT 8
US-08-451-405A-2/C

Sequence 2, Application US/08451405A
Patent No. 5736358
GENERAL INFORMATION:
APPLICANT: FASEL, NICOLAS JOSEPH
APPLICANT: REYMOND, CHRISTOPHE DOMINIQUE
TITLE OF INVENTION: DICTYOSTELID EXPRESSION VECTOR AND
TITLE OF INVENTION: METHOD FOR EXPRESSING A DESIRED PROTEIN
NUMBER OF SEQUENCES: 3
CORRESPONDENCE ADDRESS:
ADDRESSEE: THE WEBB LAW FIRM
STREET: 700 KOPPERS BUILDING, 436 SEVENTH AVENUE
CITY: PITTSBURGH
STATE: PENNSYLVANIA
COUNTRY: UNITED STATES OF AMERICA
ZIP: 15219-1818
COMPUTER READABLE FORM:
MEDIUM TYPE: 3.5" FLOPPY DISK
COMPUTER: Midwest Micro 486-50
OPERATING SYSTEM: DOS
SOFTWARE: WORDPERFECT 6.1
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/451,405A
FILING DATE: 26-MAY-1995
CLASSIFICATION: 435
PRIOR APPLICATION DATA:
APPLICATION NUMBER: 07/965,273
FILING DATE: 15-JAN-1993
INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS:
LENGTH: 731
TYPE: NUCLEIC ACID
STRANDEDNESS: SINGLE
TOPOLOGY: UNKNOWN
US-08-451-405A-2



Query Match 2.6%; Score 78.4; DB 2; Length 731;

Best Local Similarity 47.3%; Pred. No. 7.9e-06; 

Matches 299; Conservative 0; Mismatches 331; Indels 2; Gaps 

Qy 2149 attatttataaattgttagaatgattatttttcaataatttaacaacaatatttaatatt 2208 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 II III llll I 1 1 1 1 1 II II M II II 
Db 730 ATTTTTTTTAAATTGTTTGTTTTAATTGTTTTTTTTTTTTTAAAAAAAAAAATAAAATTT 671 

Qy 2209 attattattattatttctcaatttttattaaacaaaaacataaatttttgacaaattaaa 2268 

II llll I III I I III II II II I I III 



Db 670 TGAATGATTA'AAGAAAAAAATAATAAAAAAATAATAATGTAGAAAAAGGTATTTTTATT 612 

Qy 2269 ataaatgaattaatttctcaatttttcgtgcaactattacaaaaatccttcatagtccta 2328 

I III Mill I I I I II I II II I II III II 
Db 611 AAAAAGAAATTATTATTACTACTATTAGGAAAATATTTTTATTTCTAATTGATATATATA 552 

Qy 2329 atcttaatttgatgcagaggtgataataatcttaatttgatgcagaggtaataatgggcc 2388 

I I III I I II II I II I Mill II 
Db 551 AAATAAATAAATAATAAATGTTATTTGTTTGATTAAAGGGGTTGATGGTAAAAAAAAAAA 492 

Qy 2389 gggtttgagctggacttaagcatgatattgacgtactttatatttttccaaattcaaccc 2448 

II II I llll I I I III II III II 

Db 491 AAAAATTAAAAAATTAAAAAAAATTTATTTAAAAAAATAATAATTAAATAAAAAAAATTA 432 

Qy 2449 agctcgaaatatgagtctaaaattttgtccaatttaatccaagcccattttaagttcgtc 2508 

I I II I I III II I III I llll 
Db 431 AAATTTTAA-ACCTTTGTAAGATAATAGAGTGTGTTAAAAGTGTGTTTTGCACAATTAAA 373 

Qy 2509 catattattttttaatttaaaaaatttatatcattttattttaatatttaattattttat 2568 

iiii i ii minium mm mi i ii mi ii i 

Db 372 GATATATAATAATATTTTAAAAAATTTCTTTAAATACCTTTTTTTTTTGATTTTAATTTT 313 
Qy 2569 atattttttatttattgaaaatttttatatagtcatcttaacattatgttaatgtttata 2628 

i linn iii i inn i i i iii ii iii mm i 

Db 312 TTTTTTTTTTTTTTTGATTTTTTTTTTTTTTTTTTTTTTTTTTTTTGGTTGTTGTTTCCA 253 

Qy 2629 ttagagtagtattatatatatttagtataggtttattttgttaataaacttaaaaatggg 2688 

I I I I II II III llll II III M I II I II 
Db 252 ATCCAAAAATTATCTGAATTTTTTTTTTAGAATTTTCTTATCATATACCGTCACAAATCT 193 

Qy 2689 tcttgtgggctagacttggaccttaaatgctcaaactcaaacttaattcatattttaaac 2748 

III II I II II I I III llll II III' 
Db 192 ATTTTTAGGTTCACTATGTTTAATATAATTTTAAATACAAATAAAAACTCCTTCAAATAG 133 

Qy 2749 aggcttaatatttttatttacactgtttcaaa 2780 

I I llll I llll I I III III 
Db 132 AAATATTTTATTCTAATTTTTATTTTTTAAAA 101 
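
The summary figures reported for a hit can be cross-checked arithmetically. The following is a minimal Python sketch, assuming (this is inferred from the numbers in this listing, not stated by the search program) that Best Local Similarity is the match count divided by the total number of alignment columns, that is, matches + conservative + mismatches + indels. The function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


# Counts copied from the summary above: Matches 299; Conservative 0;
# Mismatches 331; Indels 2. The listing reports 47.3%.
print(round(best_local_similarity(299, 0, 331, 2), 1))  # 47.3
```

Under this assumption the reported 47.3% is reproduced exactly, which also gives a quick way to spot digits damaged in a summary line.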



RESULT 9 
US-07-991-867B-8/C

Sequence 8, Application US/07991867B 
Patent No. 5476781 
GENERAL INFORMATION: 

APPLICANT: Moyer, Richard W. 

APPLICANT: Hall, Richard L. 

APPLICANT: Gruidl, Michael E. 

TITLE OF INVENTION: Novel Entomopoxvirus Expression System
NUMBER OF SEQUENCES: 66 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: David R. Saliwanchik

STREET: 2421 N.W. 41st Street, Suite A-1

CITY: Gainesville 

STATE: FL 

COUNTRY: USA 

ZIP: 32606 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/991,867B

FILING DATE: 12-DEC-1992 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: WO 92/14818 

FILING DATE: 12-FEB-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/827,685

FILING DATE: 30-JAN-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/657,584 
REFERENCE/DOCKET NUMBER: UF114.C4
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 904-375-8100 

TELEFAX: 904-372-5800 
INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1511 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: unknown 

MOLECULE TYPE: DNA (genomic) 
ORIGINAL SOURCE: 

ORGANISM: Amsacta moorei entomopoxvirus
FEATURE: 

NAME/KEY: CDS 

LOCATION: complement (18..218)
FEATURE:

NAME/KEY: CDS

LOCATION: complement (234..782)
FEATURE:

NAME/KEY: CDS

LOCATION: 852..1511
-08-544-332-8 



Query Match 2.8%; Score 84.2; DB 4; Length 1511;

Best Local Similarity 43.6%; Pred. No. 8.8e-07;

Matches 573; Conservative 0; Mismatches 729; Indels 12; Gaps 4;



Qy 129 ttgtgttacaatataataaatacatcgtagaaataaattttattcaaattgaagtcttaa 188 

II I III II I llll I I II II II II I I III 

Db 68 TTTTATTATTATTTGATAATTGTTTATTTAATTCGTTATTGATATTAACAATATTATTTA 127 

Qy 189 ccatctttaatatttgtagatgtaatttaaatgaaagataaatacatattcttggacatg 248 

III II IMI I II I II I I I I II II 

Db 128 TCATTTTACCTATTTTTTTTTTTCTATCTACTAACGAAATATCAGATTTTGCACCTTCAA 187

Qy 249 tattttcatcttaatgtttgtggctttggtgataggtgtattgatgtacgatgtctttta 308 

III II llll I I llll II III III II 

Db 188 TATCAGAATAATAATTATCATTATTTTGCATTTATGAATAAAAATATTAATATGAATTAT 247 

Qy 309 aatcacatatcacattttgagtttgtatgatgataagtcgacataancgaaatatggtgt 368 

II Mill I II I III II II II III I 
Db 248 TATAACATAATCTACACACAGGAACATATAAATCTTGTCCACCTATTTCAATTATTTGAT 307 

Qy 369 gatcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacat 428

I I I III III II III III II llll I I 

Db 308 TTTTATTATGTTTTTTAATTGTAAAAGAAGCATCTTTATAACAAAATTGACATATAGCTT 367 

Qy 429 atatatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttatat 488 

II I I I II I I II II I I I II I llll I Mill 

Db 368 GTAATTTTTTTATTTTTTCTACTTTAGGAATTAATTTTGATATAGAATT---AAATATAT 424

Qy 489 tatttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataaaa 548 

I I II I Mill II I III II II llll I I I 
Db 425 TTCTGTTAAAGTCACAATTTAATCCAGCAACAATAACTTTTTTTTTATTATTAGCCATTT 484 

Qy 549 aaaattgtacacatttacaagcccatatacaaataattatataaatattcattaaaaaat 608 

I I II II I I II II I I I II II II I 
Db 485 TATCACAAAATTGTTCTAAATCATTTTCTTCAAAAAATTGACACTCATCTATGCCAATAA 544 

Qy 609 atatttaaatataggatataaatataactattttagaattattctactttaagataacat 668 

III III III II I I I lllllll II I I III 
Db 545 TATCATAATTATCTACGATATTGATTTCATTAATTAAATTATTTGTTTTAATGTATAAAT 604 

Qy 669 aggttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaa 728 

I I II II II I I I II I I I I I I II 
Db 605 ATTCTTTATTTAATATATTTCCGTCATGATTTATTATATTTTTATTTATAAATCTATTAT 664 

Qy 729 cataatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaa 788 

I II I I II III I II I I II I I Ml I 

Db 665 CTATATTATGAGTTATAATTACACATTTTTGATTAGATAAAATATATCTATTAATTTTTC 724 



Qy 789 gggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttc-ttc 847 

I llll I I I I II II I I I I I I II 

Db 725 GCATCAATTCTGTTGTTTTGCCAGAAAACATAGGACCAATTATTAATTCTATCGACATTT 784 

Qy 848 tttttaatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaatat 907 

IIIM II III III I II III Mill III II II II II 
Db 785 TTTTTTATTATTTGATATATTTTTTCAAAAAAAAATTAATCAATGAAAAAAAAATAAAAT 844 

Qy 908 aatgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataataga 967 

ii ill i i linn i n n inn ii i i inn n 

Db 845 TATCAAAATGGATTTACTAA-ATTCTGATATAATTTTAATAAATATTTTAAAATATTA-- 901 

Qy 968 taaattaattgtggtacattagatcaaagaacaaactagattttgtcccattctattgtt 1027 

Hill llll II I I I Ml I llll I 
Db 902 TAATTTAAAAAAAATAATAATAAACAGAGATAATGTTATTAATATTAATATATTA 956 

Qy 1028 aaaagctggtccgtttacattaaaataaggtacatgttacatgccacgtataactatctg 1087 

llll I II III III III I III I 
Db 957 AAAAAATTAGTTAATTTAGAAGAATTGCATATAATATATTATGATAATAATATTTTAAAT 1016 

Qy 1088 gttattctatcaatcacgctaatttttaacagtagaaatgaatgtaatttttaaatagaa 1147 

lllll llll II III I I I llll I II I II I III 

Db 1017 AATATTCCAGAAAATATTAAAAGTTTATATATTTCAAATTTAAATATTATTAATTTAAAT 1076 

Qy 1148 agggtcaaattgttatttgatctaacacgtagggattaatttacttattttcctaaagaa 1207 

III III II lllll I lllll II I III 
Db 1077 TTTATAACAAAATTAAAAAATATAACATATTTAGATATATCTTATAACAAAAATAGCAAT 1136 

Qy 1208 ataagtaaaatataatttgaatcttaatacaaaaactttcatgatacttttatcatattt 1267 

llllllll II I I III II lllll I II 
Db 1137 ATAAGTAATATTATACTACCACATTCTATAGAATTTTTAAATTGTGAATCATGTAATATA 1196 

Qy 1268 tacttataatttaatattgtgagagtaacaaarttaaaaaacatagaaacaccaaaagtt 1327 

I II I llll I II III llllllll II II I I III I 
Db 1197 AATGACTATAATTTTATTAATAATTTAGTAAATTTAAAAAAATTAATAATATCTAAAAAT 1256 

Qy 1328 agttatggtgtgactcatatacacagttaaaatttgaataaatttttttcttcgtcatta 1387 

I I llll llll llll II M II I 

Db 1257 AAATTTGGTAACTTTAATAATGTTTTTCCTATTAGTATAGTTGAGTTAAATATGGAATCA 1316 

Qy 1388 attccatcatgggtttttttttttctagttaagccataattatcaaaataatca 1441 

lllll I II I III III II I I Mil III I 
Db 1317 ATACAAATAAAAGATTATAAATTTATAGAAAAATTAATTAATTTAAAAAAATTA 1370 



RESULT 6 
US-08-883-795A-36

Sequence 36, Application US/08883795A 
Patent No. 5985607 
GENERAL INFORMATION: 

APPLICANT: Delcuve, Genevieve 

APPLICANT: Awang, Gregor 

TITLE OF INVENTION: Recombinant DNA Molecules and Expression 
TITLE OF INVENTION: Vectors for Tissue Plasminogen Activator 
NUMBER OF SEQUENCES: 39 



ADDRESSEE: BERESKIN & PARR

STREET: 40 King Street West 

CITY: Toronto 

STATE: Ontario 

COUNTRY: Canada 

ZIP: M5H 3Y2 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/883,795A

FILING DATE: 27-JUN-1997 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Gravelle, Micheline 
Query Match 2.8%; Score 84.2; DB 1; Length 1511;

Best Local Similarity 43.6%; Pred. No. 8.8e-07; 

Matches 573; Conservative 0; Mismatches 729; Indels 12; Gaps 4; 

Qy 129 ttgtgttacaatataataaatacatcgtagaaataaattttattcaaattgaagtcttaa 188 

II I III II I MM I I I I II II II I I II I 
Db 68 TTTTATTATTATTTGATAATTGTTTATTTAATTCGTTATTGATATTAACAATATTATTTA 127 



Qy 189 ccatctttaatatttgtagatgtaatttaaatgaaagataaatacatattcttggacatg 248

III II Mill I II I I I I I I I II II

Db 128 TCATTTTACCTATTTTTTTTTTTCTATCTACTAACGAAATATCAGATTTTGCACCTTCAA 187

Qy 249 tattttcatcttaatgtttgtggctttggtgataggtgtattgatgtacgatgtctttta 308

III II llll I I MM II I II III II

Db 188 TATCAGAATAATAATTATCATTATTTTGCATTTATGAATAAAAATATTAATATGAATTAT 247

Qy 309 aatcacatatcacattttgagtttgtatgatgataagtcgacataancgaaatatggtgt 368
II HIM I II I III II II II III I

Db 248 TATAACATAATCTACACACAGGAACATATAAATCTTGTCCACCTATTTCAATTATTTGAT 307

Qy 369 gatcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacat 428

I I I III III II I II II I II llll II

Db 308 TTTTATTATGTTTTTTAATTGTAAAAGAAGCATCTTTATAACAAAATTGACATATAGCTT 367

Qy 429 atatatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttatat 488

II I I I II I I II II I I I II I llll I Mill
Db 368 GTAATTTTTTTATTTTTTCTACTTTAGGAATTAATTTTGATATAGAATT---AAATATAT 424

Qy 489 tatttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataaaa 548

I I II I Mill II I III II II llll I I I

Db 425 TTCTGTTAAAGTCACAATTTAATCCAGCAACAATAACTTTTTTTTTATTATTAGCCATTT 484

Qy 549 aaaattgtacacatttacaagcccatatacaaataattatataaatattcattaaaaaat 608

I I II II I I II II I I I II II II I

Db 485 TATCACAAAATTGTTCTAAATCATTTTCTTCAAAAAATTGACACTCATCTATGCCAATAA 544

Qy 609 atatttaaatataggatataaatataactattttagaattattctactttaagataacat 668

III III III II I I I lllllll llll I II

Db 545 TATCATAATTATCTACGATATTGATTTCATTAATTAAATTATTTGTTTTAATGTATAAAT 604

Qy 669 aggttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaa 728

I I II II II I I I II I I I I I I II

Db 605 ATTCTTTATTTAATATATTTCCGTCATGATTTATTATATTTTTATTTATAAATGTATTAT 664

Qy 729 cataatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaa 788
I llll II III llll I llll I II I

Db 665 CTATATTATGAGTTATAATTACACATTTTTGATTAGATAAAATATATCTATTAATTTTTC 724

Qy 789 gggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttc-ttc 847

I llll I I I I II II I I I I I I II

Db 725 GCATCAATTCTGTTGTTTTGCCAGAAAACATAGGACCAATTATTAATTCTATCGACATTT 784

Qy 848 tttttaatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaatat 907

Mill II III III I II III lllll III II II II II

Db 785 TTTTTTATTATTTGATATATTTTTTCAAAAAAAAATTAATCAATGAAAAAAAAATAAAAT 844

Qy 908 aatgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataataga 967

II III I I NIMH I II II lllll II I I lllll II

Db 845 TATCAAAATGGATTTACTAA-ATTCTGATATAATTTTAATAAATATTTTAAAATATTA-- 901

Qy 968 taaattaattgtggtacattagatcaaagaacaaactagattttgtcccattctattgtt 1027
lllll I I II II I I I I II I I III I

Db 902 TAATTTAAAAAAAATAATAATAAACAGAGATAATGTTATTAATATTAATATATTA 956



Qy 1028 aaaagctggtccgtttacattaaaataaggtacatgttacatgccacgtataactatctg 1087 

Mil I II III II I III I III I 

Db 957 AAAAAATTAGTTAATTTAGAAGAATTGCATATAATATATTATGATAATAATATTTTAAAT 1016 

Qy 1088 gttattctatcaatcacgctaatttttaacagtagaaatgaatgtaatttttaaatagaa 1147 

lllll I II I II III I I I llll I II I II I II I 

Db 1017 AATATTCCAGAAAATATTAAAAGTTTATATATTTCAAATTTAAATATTATTAATTTAAAT 1076 



Qy 1148 agggtcaaattgttatttgatctaacacgtagggattaatttacttattttcctaaagaa 1207 

I I III II lllll I III II I I I II I 
Db 1077 TTTATAACAAAATTAAAAAATATAACATATTTAGATATATCTTATAACAAAAATAGCAAT 1136 

Qy 1208 ataagtaaaatataatttgaatcttaatacaaaaactttcatgatacttttatcatattt 1267 

llllllll II I I I II II II II I I II 
Db 1137 ATAAGTAATATTATACTACCACATTCTATAGAATTTTTAAATTGTGAATCATGTAATATA 1196

Qy 1268 tacttataatttaatattgtgagagtaacaaarttaaaaaacatagaaacaccaaaagtt 1327 

I II I llll I II III llllllll II llll III I 
Db 1197 AATGACTATAATTTTATTAATAATTTAGTAAATTTAAAAAAATTAATAATATCTAAAAAT 1256 

Qy 1328 agttatggtgtgactcatatacacagttaaaatttgaataaatttttttcttcgtcatta 1387 

I I llll I III llll II II II I 

Db 1257 AAATTTGGTAACTTTAATAATGTTTTTCCTATTAGTATAGTTGAGTTAAATATGGAATCA 1316 

Qy 1388 attccatcatgggtttttttttttctagttaagccataattatcaaaataatca 1441 

II I I I I II I III III II I I llll III I 

Db 1317 ATACAAATAAAAGATTATAAATTTATAGAAAAATTAATTAATTTAAAAAAATTA 1370 



RESULT 4
US-08-107-755A-8

Sequence 8, Application US/08107755A 
Patent No. 5721352
GENERAL INFORMATION: 

APPLICANT: Moyer, Richard W. 

APPLICANT: Hall, Richard L.

APPLICANT: Gruidl, Michael E. 

TITLE OF INVENTION: Novel Entomopoxvirus Expression System
NUMBER OF SEQUENCES: 40 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: David R. Saliwanchik 

STREET: 2421 N.W. 41st Street, Suite A-1

CITY: Gainesville 

STATE: Florida 

COUNTRY: U.S.A. 

ZIP: 32606 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/107,755A

FILING DATE: 19*1993 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/827,685

FILING DATE: 30-JAN-1992
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/657,584 

FILING DATE: 19-FEB-1991 
ATTORNEY/AGENT INFORMATION: 

NAME: Saliwanchik, David R. 

REGISTRATION NUMBER: 31,794 

REFERENCE/DOCKET NUMBER: UF114.C2
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (904) 375-8100 

TELEFAX: (904) 372-5800 
INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1511 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: unknown 
MOLECULE TYPE: DNA (genomic) 
ORIGINAL SOURCE: 

ORGANISM: Amsacta moorei entemopoxvirus 
FEATURE: 

NAME/KEY: CDS 

LOCATION: complement (18..218)
Query Match 3.2%; Score 98.8; DB 4; Length 19124;

Best Local Similarity 45.3%; Pred. No. 3.7e-09;

Matches 561; Conservative 1; Mismatches 659; Indels 18; Gaps 5; 

Qy 139 atataataaatacatcgtagaaataaattttattcaaattgaagtcttaaccatctttaa 198
Mill I I III! I III I I I III I I Mil I I!
Db 7210 ATATATGTATATTACAGTAGTATTATAATATGGTAGAATAAGAATAATAACACTTTTGTG 7151

Qy 199 tatttgtagatgtaatttaaatgaaagataaatacatattcttggacatgtattttcatc 258

II Ml II II I I I II II II III III! I
Db 7150 AATGTATATATATATGTAAGGTATAATTTATGTATTACAATATATATAAATATT---GTA 7094

Qy 259 ttaatgtttgtggctttggtgataggtgtattgatgtacgatgtcttttaaatcacatat 318

I II I I I III llll MM I I II II I I I I II I
Db 7093 TATATATATATATATATATTAATAGTTGTACTATTATAATATTACAATATATGTATATGT 7034

Qy 319 cacattttgagtttgtatgatgataagtcgacataancgaaatatggtgtgatcttcact 378

II I I I II Ml I II I I I I I I II I
Db 7033 TAAAAAAATAATATTTAATATGTATATAATAATAATTATTAGTTTTATATATTTTTAAAA 6974

Qy 379 tttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacatatatatatat 438
I I III I I III I! I II III I I I MINI

Db 6973 AAAATATATATATATATTAATAAATTTATAATAAATTTAAATATTCTAACAAAAAAAAAA 6914

Qy 439 atcttcaaattttataataaaaattgtgtttaaataatttacagttatattattttttta 498

I III III III II II II I II Mill I I II
Db 6913 TATAATCAGAAATATTATATTTTATGTATTCCTTTATTTATCTATTTAATTATATATATA 6854

Qy 499 tctctaattttatttgtcgccaaatttttagttgatattttaacataaaaaaaattgtac 558

I I HUM I I llll I I I II I II I Mill I
Db 6853 TTATTTTTTTTATGTTT-----TATTTATTAAGTAAAATTATAAATGAGAAAAAAAAAAT 6799

Qy 559 acatttacaagcccatatacaaataattatataaatattcattaaaaaatatatttaaat 618

II I I Mllll III Mill I I III I I I III I
Db 6798 ACGAAAATACAAACATATAAAAAAGTATATATGCAACGTGTTTATATATTTAATTATTAA 6739

Qy 619 ataggatataaatataactattttagaattattctactttaagata---acataggttaa 675

Mill llll I llll Mill II llll III I III I
Db 6738 CATTAATATATGTATATTTTTTTTGACTTTATTTTAATTTATTATATATATATATATATT 6679

Qy 676 atgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaacataatc 735

Db 6678 AGAGATMCAaL^^ 6619

Qy 736 actaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaagggaaat 795

II III III I I I I I I I I II I! M
Db 6618 TTTATATATTCATATATATATAATTGATATAGATACATATTCTTTGTATTGTTGTATTAT 6559

Qy 796 ttgagagtaagttcatgtttatattatacataatgaagttgatgttttcttctttttaat 855

I I I III II llll III I II II I I I II II
Db 6558 ATTAAAAGTAGTATATTATTATTATTAATTTTTGTTGTTATATATTATAATTTATTATAT 6499

Qy 856 atttttatacaaaatatttaaataaa----ataattaaggattgaatgaaaaatataatg 911

I I llll I I II III III llll II! I Ml llll II
Db 6498 AATAATATATATAGCATCAAAAAAAAAATGATAAATAAATAACAGGAAAAATATATTATT 6439

Qy 912 aaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataatagataaa 971

I I I II I II! I Ml III I III I II I I

Db 6438 ATATTATATTATATTATATTAATAAAAATGTTTTTATCATTTGTTTTGTTGTATTTTTTT 6379

Qy 972 ttaattgtggtacattagatcaaagaacaaactagattttgtcccattctattgttaaaa 1031

llll llll II II III MM I I II I III

Db 6378 ATGTATTTCATGCATTTTATGAATTTCAAAATTTTATTGTATAATATAAAAAAATAAGTA 6319

Qy 1032 gctggtccgtttacattaaaataaggtacatgttacatgccacgtataactatctggtta 1091
I III I I Ml I I I I II II II I

Db 6318 AAAATACACATTATAAATATATATATTCAAATATGAGTTATTAATAAAATGTTCA---TG 6262

Qy 1092 ttctatcaatcacgctaatttttaacagtagaaatgaatgtaatttttaaatagaaaggg 1151
Mllll II III I II I I I II I III II II I
Db 6261 TTCTATATATTTATATAAATGAAAATATTTGTTATAATATAAATACATATATGCTACTAT 6202

Qy 1152 tcaaattgttatttgatctaacacgtagggattaatttacttattttcctaaagaaataa 1211



llll I M llll I III I III II II I I I II 
Db 6201 ATAAATATTAATAATATCTTTAAAGTATATACTAAAATATATAAAAATGCATGTATAAAA 6142

Qy 1212 gtaaaatataatttgaatcttaatacaaaaactttcatgatacttttatcatattttact 1271

M II! Ill I I! llll III I III I II 
Db 6141 ATAGTATAAAATCATACATATATATATATATATATATATATATATATATATATATATATA 6082 

Qy 1272 tataatttaatattgtgagagtaacaaarttaaaaaacatagaaacaccaaaagttagtt 1331 

llll I llll III III Ml I I I I I I III 
Db 6081 TATATATGCATATATGTAGAATAAATTTATTTATATTCCAAATACTGATATTGTTTTATA 6022 

Qy 1332 atggtgtgactcatatacacagttaaaatttgaataaat 1370 

I II I I III I I II II II I 
Db 6021 TTTGTTATATTATAATAACAAAAAAGAACGACAAGAAGT 5983 



RESULT 2
US-08-487-826B-13

Sequence 13, Application US/08487826B 
Patent No. 5993827 
GENERAL INFORMATION: 
APPLICANT: Sim, Kim L.

APPLICANT: Chitnis, Chetan
APPLICANT: Miller, Louis H.
APPLICANT: Peterson, David S.
APPLICANT: Su, Xin-zhaun
APPLICANT: Wellems, Thomas E.

TITLE OF INVENTION: BINDING DOMAINS FROM PLASMODIUM VIVAX

TITLE OF INVENTION: AND PLASMODIUM FALCIPARUM ERYTHROCYTE BINDING PROTEINS 

NUMBER OF SEQUENCES: 45 



ADDRESSEE: Knobbe Martens Olson & Bear 
STREET: 620 Newport Center Drive 16th Floor 
CITY: Newport Beach 
STATE: California 

COUNTRY: US 

ZIP: 92660 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/487,826B

FILING DATE: 10-SEP-1993

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 



REGISTRATION NUMBER: 29,655 

REFERENCE/DOCKET NUMBER: NIH121.001CP1
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 235-8550 
TELEFAX: (619) 235-0176
INFORMATION FOR SEQ ID NO: 13: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 19124 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
HYPOTHETICAL: NO 
ANTI-SENSE: NO 
US-08-487-826B-13



Query Match 3.2%; Score 96.2; DB 4; Length 19124;

Best Local Similarity 45.1%; Pred. No. 1e-08;

Matches 591; Conservative 0; Mismatches 709; Indels 10; Gaps 6; 

Qy 138 aatataataaatacatcgtagaaataaattttattcaaattgaagtct---taaccatct 194 

Ml! I II I! Mllll! Ml II I III I II II I 
Db 5418 AATACGTAACATGTATTATAGAAATAATAAGAATTTAATATTAAGGATAAATATAAATAT 5477 
III I III I I I III III III III II Mill I I

Db 538 ATATTAAGCTATAGTCTTGCTTTGCCTAGTTTCTAGAAAATTAATTATATATAATTATAA 479 

Qy 1141 aa tagaaagggtcaaattgttatttgatctaacacgtagggattaatttacttat 1195 

I I I I I I I I I II II II I II I II I III 
Db 478 TACTTTTTTAATAAGCTATTAGTAAAAATTAATATATTATCTAAGTATATAAAGGCAAAT 419 

Qy 1196 tttcctaaagaaataagtaaaatataatttgaatcttaatacaaaaacttt 1246 

II II I I II Nil II II II II II II II I II 
Db 418 TTATTTAGAAAGTTATGTAAGATTTAGTTAATATAATATTAGAATATACTT 368 



RESULT 14 
Q11710 

ID Q11710 standard; DNA; 5852 BP.

AC Q11710;

DT 30-JUL-1991 (first entry) 

DE Dictyostelium plasmid Ddp2 containing Rep gene. 

KW slime mould; replication; Rep gene; ss.

OS Dictyostelium discoideum. 

FH Key Location/Qualifiers 

FT CDS 2378..5041

FT /*tag= a

FT /product= involved in extrachromosomal replication

PN WO9106644-A. 

PD 16-MAY-1991. 

PF 02-NOV-1990; AO0530. 

PR 02-NOV-1989; AU-007187. 

PA (OYMA-) MACQUARIE UNIV. 

PI Slade MB, Chang ACM, Williams KL; 

DR WPI; 91-164194/22. 

DR P-PSDB; R11988. 

PT Polypeptide facilitating extra-chromosomal replication - of 

PT recombinant plasmid in Dictyostelium species 

PS Claim 15; Fig 1; 90pp; English. 

CC The sequence of Ddp2 has been found to contain the putative open 

CC reading frame indicated in the Features Table. The possible ORF is 

CC flanked by regions with similarity to promoter and poly adenylation 

CC signals of known Dictyostelium genes. The RNA and polypeptide

CC product of the Rep gene have not, however, been detected. It is

CC believed that the product is produced in low amounts to positively

CC regulate initiation of plasmid replication. The polypeptide may also

CC contain regions that act as negative regulators of plasmid copy 

CC number. See also Q11711 and Q11712. 

SQ Sequence 5852 BP; 2298 A; 651 C; 708 G; 2195 T;



Query Match 2.8%; Score 84.4; DB 1; Length 5852; 

Best Local Similarity 46.2%; Pred. No. 0.0011;

Matches 317; Conservative 0; Mismatches 366; Indels 3; Gaps 1; 

Qy 1997 ttccttttaatcctttctttttacttcattttataacgaattctatggataatgttccct 2056 

II llll II I I III II III I I I I I III 
Db 1635 TTTTTTTTTGTCATGACACTTTTTTTTTTTTGTCATGACACTTTTTTTTTAAAAAAAAAA 1694 

Qy 2057 acaaacatgtcattacaatgtttaattataaattccattcttctattttactaagatatt 2116 

I III llll I I I I III II I I III I llll I Hill 
Db 1695 AAAAAAATGTTAAAATACTATTTGATGACATTCATTTTTCCTAGTTTTTTTTTAGATAGA 1754 

Qy 2117 agtaacttcaaactgctgatttttactaatttattatttataaattgttagaatgattat 2176 

III III III II II I III I I I I llll I III 
Db 1755 TATAAAAATAAATTGCCTATCGATATATACTTAATTTATTAAGATTGAATAATATTTTAA 1814 

Qy 2177 ttttcaataatttaacaacaatatttaatattattattattattatttctcaatttttat 2236 

llll Ml II I III | || II II II I I II I II UN I 
Db 1815 TTTTTAATAAATTCTACTTTTTTTTTTTTTTTCTTTTTTTTTTAAATTTTAAAATTTTTT 1874 

Qy 2237 taaacaaaaacataaatttttgacaaattaaaataaatgaattaatttctcaatttttcg 2296 

I I I II III llll II llll I I II I II 

Db 1875 TTTTTTATTAGATCTCATAATTAAAAATCAATTTAAAATTAAAAGTTATTTTTAAATATG 1934 

Qy 2297 tgcaactattacaaaaatccttcatagtcctaatcttaatttgatgcagaggtgataata 2356 

llll llll llll I III II II III 



Db 1935 CAAAAACTATAAAAAACTAATGTAGTTTAACCAACTTTTTTCTATTTCTTTTTTTTTTTT 1994 

Qy 2357 atcttaatttgatgcagaggtaataatgggccgggtttgagctggacttaagcatgatat 2416 

I II III I II II II II III III 

Db 1995 TTTTTTTTTTTACTTTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCTCATTATAA 2054 

Qy 2417 tgacgtactttatatttttccaaattcaacccagctcgaaatatgagtctaaaattttgt 2476 

I I I I III II I I llll I II II I 
Db 2055 ATATTAATTACTTTGGTTT---TTTTTGATTTTTTTTTTAATAAATTTAAAATTTTATTC 2111

Qy 2477 ccaatttaatccaagcccattttaagttcgtccatattattttttaatttaaaaaattta 2536 

I II llll I I llll llll I I III I I I II 
Db 2112 TCTATCTAATTATACCTTATTTATAAATATTGGAATAATATATCAAATATTTATCAGTTT 2171 

Qy 2537 tatcattttattttaatatttaattattttatatattttttatttattgaaaatttttat 2596 

I III I III I I II 1 1 1 1 1 1 1 I I III III II Mill 
Db 2172 TGGCATGACAATTTTAATTATATTTATTTTTTGATTAGTTTTTTTTTTTTTTTTTTTTTA 2231 

Qy 2597 atagtcatcttaacattatgttaatgtttatattagagtagtattatatatatttagtat 2656 

Db 2232 wymCTimmm™ 2291 

Qy 2657 aggtttattttgttaataaacttaaa 2682 

I I II I III I I llll 
Db 2292 TTTTATTTTATTTTATTTATTGTAAA 2317 
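
The Qy/Db coordinates of a hit can also be checked against its Matches/Mismatches/Indels line. The sketch below is illustrative only: the helper name `span` is hypothetical, and reading descending Db coordinates as a complement-strand hit (the `/C` suffix on some result identifiers) is an assumption, not something the listing states.

```python
def span(start, end):
    """Return (residue_count, strand) for one side of an alignment."""
    length = abs(end - start) + 1
    strand = "forward" if end >= start else "complement"
    return length, strand


# Q11710 hit above: Qy 1997..2682, Db 1635..2317,
# Matches 317; Conservative 0; Mismatches 366; Indels 3.
qy_len, _ = span(1997, 2682)
db_len, db_strand = span(1635, 2317)
columns = 317 + 0 + 366 + 3

# Residues plus gap characters on each side must equal the column count.
qy_gaps = columns - qy_len
db_gaps = columns - db_len
print(qy_len, db_len, db_strand, qy_gaps, db_gaps)  # 686 683 forward 0 3
```

Here the three gap positions all fall on the Db side, which agrees with the reported Indels 3 and with the single gapped Db line in the alignment.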



RESULT 15 
Q28302 

ID Q28302 standard; DNA; 1511 BP.

AC Q28302; 

DT 12-FEB-1993 (first entry) 

DE AmEPV tk DNA.

KW Entomopoxvirus; thymidine kinase; non-essential; regulatory sequences; 

KW vector; ss. 

OS Amsacta moorei. 

FH Key Location/Qualifiers 

FT CDS 852..1511

FT /*tag= a

FT /label= ORF_Q3

FT CDS complement (234..782)

FT /*tag= b

FT /label= ORF_Q2

FT CDS complement (17..218)

FT /*tag= c

FT /label= ORF_Q1

FT promoter 750..890

FT /*tag= d

PN WO9214818-A.

PD 03-SEP-1992. 

PF 12-FEB-1992; U00855. 

PR 19-FEB-1991; US-657584.

PR 30-JAN-1992; US-827685. 

PA (DTFL ) UNIV FLORIDA. 

PI Gruidl ME, Hall RL, Moyer RW; 

DR WPI; 92-316172/38. 

DR P-PSDB; R29653-55. 

PT New viral vectors and chimeric vaccines - comprise entomopoxvirus 

PT expression system contg. spheroidin or thymidine kinase sequences 

PS Disclosure; Fig 3; 110pp; English.

CC The sequence given is derived from the Entomopoxvirus, Amsacta moorei 

CC (AmEPV) and contains the thymidine kinase (tk) DNA sequence. The open 

CC reading frames indicated in the features table encode the tk protein 

CC itself and also other structural or regulatory genes associated with 

CC tk. The tk gene maps near the left end of the physical map of the 

CC AmEPV genome. This gene is not highly related to any other 

CC vertebrate poxvirus tk gene. Thymidine kinase is a non-essential 

CC protein which makes its gene desirable as a site for the insertion of

CC exogenous DNA. 

SQ Sequence 1511 BP; 640 A; 128 C; 98 G; 645 T; 



Query Match 2.8%; Score 84.2; DB 1; Length 1511;
PA (RPRG-) RPR GENCELL ASIA PACIFIC INC.

PI Hamada H;

DR WPI; 99-243728/20.

PT New apoptosis-resistant virus-sensitive cell

PS Example 1; Page 34-38; 51pp; English.

CC The present invention describes an apoptosis-resistant virus-sensitive

CC cell line into which an apoptosis resistance gene has been introduced. 

CC The recombinant viruses generated are capable of expressing apoptosis- 

CC associated genes. These can then be used in a variety of diseases for

CC which the induction of apoptosis by gene transfer, or where the 

CC inhibition of harmful apoptosis, is therapeutic. The recombinant viruses 

CC are useful as vectors for gene therapy which can be applied to cancer 

CC therapy for destroying cancer cells selectively, the treatment of 

CC autoimmune diseases and graft rejection reaction, and apoptosis induction 

CC therapy for inflammatory cells in inflammatory diseases. Prior arts have 

CC encountered the problem where if an adenovirus vector capable of 

CC expressing an apoptosis-associated gene is introduced into animal cells,

CC the cells producing the virus will be destroyed because the period of 

CC time required to induce cell death by apoptosis is shorter than that 

CC required to replicate and produce the virus, resulting in failure to 

CC obtain a recombinant virus having the integrated apoptosis-associated

CC gene. In this invention an apoptosis-resistant 293 cell line (having an 

CC apoptosis resistant gene introduced) is established and overcomes the 

CC problem. The present sequence represents the cowpox virus bsr gene which 

CC is used in an example from the present invention. 

SQ Sequence 7797 BP; 2542 A; 1760 C; 1656 G; 1839 T;



Query Match 3.0%; Score 90.8; DB 1; Length 7797;

Best Local Similarity 45.4%; Pred. No. 0.00017; 

Matches 326; Conservative 0; Mismatches 392; Indels 0; Gaps 0;

Qy 1979 caattttcgcagtataagttccttttaatcctttctttttacttcattttataacgaatt 2038 

inn I ill! mi i iii imi ii mi i n 

Db 5586 CAATTCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5527 

Qy 2039 ctatggataatgttccctacaaacatgtcattacaatgtttaattataaattccattctt 2098 

I I I I II I I I II I III II I II II II 
Db 5526 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5467

Qy 2099 ctattttactaagatattagtaacttcaaactgctgatttttactaatttattatttata 2158 

I llll I I II I II II lllll I III II III I 

Db 5466 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5407 

Qy 2159 aattgttagaatgattatttttcaataatttaacaacaatatttaatattattattatta 2218 

II II I II lllll I III I III I II II II II 
Db 5406 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5347 

Qy 2219 ttatttctcaatttttattaaacaaaaacataaatttttgacaaattaaaataaatgaat 2278 

II III I lllll II I lllll II I I I 

Db 5346 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5287 

Qy 2279 taatttctcaatttttcgtgcaactattacaaaaatccttcatagtcctaatcttaattt 2338 
Db 5286 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5227

Qy 2339 gatgcagaggtgataataatcttaatttgatgcagaggtaataatgggccgggtttgagc 2398 

I I I I I I! III! Ill III 

Db 5226 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5167 

Qy 2399 tggacttaagcatgatattgacgtactttatatttttccaaattcaacccagctcgaaat 2458 

II! I I II I III I lllll II I I 

Db 5166 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5107 

Qy 2459 atgagtctaaaattttgtccaatttaatccaagcccattttaagttcgtccatattattt 2518 

I I I llll I III I llll III I II III 

Db 5106 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5047 

Qy 2519 tttaatttaaaaaatttatatcattttattttaatatttaattattttatatatttttta 2578 

III II! II! I I llll llll I II! II III! I I MM 

Db 5046 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 4987 

Qy 2579 tttattgaaaatttttatatagtcatcttaacattatgttaatgtttatattagagtagt 2638 



Db 4986 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 4927
Qy 2639 attatatatatttagtataggtttattttgttaataaacttaaaaatgggtcttgtgg 2696 
Db 4926 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 4869



RESULT 12 
X33184/C 

ID X33184 standard; DNA; 7996 BP. 

AC X33184; 

DT 25-JUN-1999 (first entry)

DE Base sequence of the plasmid pRx-Bcl 2-i-hCD 25. 

KW Cowpox virus; bsr; viral vector; expression; apoptosis; resistance; 

KW crmA; bcl-2; bcl-xl; FLIP; survivin; IAP; ILP; adenovirus; cancer; 

KW autoimmune disease; graft rejection reaction; inflammation; 

KW inflammatory disease; ss. 

OS Synthetic. 

OS Homo sapiens. 

PN WO9913073-A2.

PD 18-MAR-1999. 

PF 07-SEP-1998; J04010. 

PR 08-SEP-1997; JP-259235. 

PA (RPRG-) RPR GENCELL ASIA PACIFIC INC. 

PI Hamada H; 

DR WPI; 99-243728/20. 

PT New apoptosis-resistant virus-sensitive cell 

PS Example 3; Page 46-49; 51pp; English. 

CC The present invention describes an apoptosis-resistant virus-sensitive

CC cell line into which an apoptosis resistance gene has been introduced.

CC The recombinant viruses generated are capable of expressing apoptosis-

CC associated genes. These can then be used in a variety of diseases for

CC which the induction of apoptosis by gene transfer, or where the 

CC inhibition of harmful apoptosis, is therapeutic. The recombinant viruses 

CC are useful as vectors for gene therapy which can be applied to cancer 

CC therapy for destroying cancer cells selectively, the treatment of 

CC autoimmune diseases and graft rejection reaction, and apoptosis induction 

CC therapy for inflammatory cells in inflammatory diseases. Prior arts have 

CC encountered the problem where if an adenovirus vector capable of

CC expressing an apoptosis-associated gene is introduced into animal cells, 

CC the cells producing the virus will be destroyed because the period of 

CC time required to induce cell death by apoptosis is shorter than that

CC required to replicate and produce the virus, resulting in failure to 

CC obtain a recombinant virus having the integrated apoptosis-associated 

CC gene. In this invention an apoptosis-resistant 293 cell line (having an 

CC apoptosis resistant gene introduced) is established and overcomes the 

CC problem. The present sequence represents the base sequence of the

CC plasmid pRx-Bcl 2-i-hCD 25, which contains the human Bcl-2 gene, and 

CC is used in an example from the present invention. 

SQ Sequence 7996 BP; 2463 A; 2015 C; 1829 G; 1689 T; 



Query Match 3.0%; Score 90.8; DB 1; Length 7996; 

Best Local Similarity 45.4%; Pred. No. 0.00017;

Matches 326; Conservative 0; Mismatches 392; Indels 0; Gaps 0;

Qy 1979 caattttcgcagtataagttccttttaatcctttctttttacttcattttataacgaatt 2038 

Db 5785 CAATTCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5726

Qy 2039 ctatggataatgttccctacaaacatgtcattacaatgtttaattataaattccattctt 2098 

I I I I II I llll Mil II I II II II 
Db 5725 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5666 

Qy 2099 ctattttactaagatattagtaacttcaaactgctgatttttactaatttattatttata 2158 

I llll I I II I II II lllll I III II III I 
Db 5665 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5606 

Qy 2159 aattgttagaatgattatttttcaataatttaacaacaatatttaatattattattatta 2218 

II II I II lllll I III I III I II II II II 
Db 5605 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 5546 
T73866/C

ID T73866 standard; DNA; 3045 BP. 

AC T73866; 

DT 26-JAN-1998 (first entry) 

DE Cotton fibre promoter clone Racl3 construct, pCGN4735. 

KW promoter; fibre-specific; transcriptional factor; promoter; 

KW altered phenotype; colour; melanin; indigo; ss. 

OS Gossypium hirsutum cv. coker 130. 

PN WO9640924-A2. 

PD 19-DEC-1996. 

PF 07-JUN-1996; O09897. 

PR 07-JON-1995; US-480178. 

PR 01-JUL-1996; ZA-005572. 

PA (CALJ ) CALGENE INC. 

PI Mcbride K, Pear JR, Perez-Grau L, Stalker DM; 

DR WPI; 97-052325/05. 

PT DNA construct contg. gene of interest controlled by cotton fibre 

PT transcriptional factor - used to produce altered phenotype cotton 

PT fibre cells expressing genes affecting pigmentation 

PS Claim 23; Fig 5A-E; 95pp; English. 

CC The present sequence is the Rac13 promoter construct, pCGN4735, isolated 

CC from cotton fibre genomic clone 15-1. DNA constructs containing 

CC cotton fibre-specific transcriptional factor promoters are useful to 

CC produce cotton fibre cells with altered phenotype, especially altered 

CC colour. Genes involved in the production of melanin (e.g. tyrosinase 

CC gene and ORF438 encoded protein from Streptomyces antibioticus) and 

CC indigo (mono-oxygenase genes possibly in conjunction with a 

CC tryptophanase gene) are of interest. The promoters of the invention are 

CC reliable and permit expression of a protein selectively in cotton fibre 

CC to affect qualities such as fibre strength, length, colour and dyability 

CC as required. The construct and methods can also be used for the 

CC introduction of other advantageous genes into a cotton plant, e.g. a 

CC plant hormone. In particular, fibres from a plant producing coloured 

CC fibres may be used to produce yarns and/or fabrics that do not require 

CC dyeing. 

SQ Sequence 3045 BP; 1063 A; 450 C; 366 G; 1162 T; 



Query Match 3.1%; Score 94.8; DB 1; Length 3045; 

Best Local Similarity 51.7%; Pred. No. 5.8e-05; 

Matches 268; Conservative 2; Mismatches 234; Indels 14; Gaps 2; 
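
The "Best Local Similarity" figure reported for each hit appears to be the number of matches divided by the total number of aligned columns (matches + conservative + mismatches + indels). The sketch below checks that reading against the figures printed for this hit. The function name is hypothetical, and the definition is an assumption inferred from the reported numbers, not taken from the search program's documentation.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


# Figures reported for this hit: 268 matches, 2 conservative,
# 234 mismatches, 14 indels.
similarity = best_local_similarity(268, 2, 234, 14)
print(f"{similarity:.1f}%")  # prints "51.7%"
```

The same formula also reproduces the similarity values printed for the other hits in this listing, which is the only support offered here for the assumed definition.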

Qy 824 cataatgaagttgatgttttcttctttttaatatttttatacaaaatatttaaataaaat 883 

Mill II II II II II III II :IM I I I MM I 
Db 1334 CATAACTAACTTTTGGTGTTTCTATGTTTTTTAAYTTTGTTACTCTCACAATATTAAATT 1275 

Qy 884 aattaaggattgaatgaaaaatataatgaaagtcgttttactaatagtcatattgcattt 943 

I I I Mil III llllllll II II III I III III MM 
Db 1274 ATAAGTAAAATATGATAAAAGTATCATGAAAGTTTTTGTATTAAGATTCAAATTATATTT 1215 

Qy 944 tgtcgcatctacttaaataatagataaattaattgtggtacattagatcaaagaacaaac 1003 

I III llll llllllll! Illlllllll Mil! 

Db 1214 TACTTATTTCTTTAGGAAAATAAGTAAATTAATCCCTACGTGTTAGATCAAATAACAA-- 1155 

Qy 1004 tagattttgtcccattctattgttaaaagctggtccgtttacattaaaataaggtacatg 1063 

iiii iii mini iii n ii I ii ii i ii 

Db 1156 TTTGACCCTTTCTATTTAAAAATTACATTCATTTCTACTGTTAAAAATTAGCGTG 1102 

Qy 1064 ttacatgccacgtataactatctggttattctatcaatcacgctaatttttaacagtaga 1123 

llll llll I II I II I II II I I 
Db 1101 ATTGATAGAATAACCAGATAGTTATACGTGGCATGTAACATGTACCTTATTTTAATGTAA 1042 

Qy 1124 aatgaatgtaatttttaaatagaaagggtcaaa ttgttatttgatctaacacg 1176 

i ii m mini iii iiii inn iiiimni 

Db 1041 ACGGACCAGCTTTTAACAATAGAATGGGACAAAATCTAGTTTGTTCTTTGATCTAATGTA 982 

Qy 1177 tagggattaatttacttattttcctaaagaaataagtaaaatataatttgaatcttaata 1236 

llllllll llll I I I Hill III III I III II 
Db 981 CCACAATTAATTTATCTATTATTTAAGTAGATGCGACAAAATGCAATATGACTATTAGTA 922 

Qy 1237 caaaaactttcatgatacttttatcatattttacttataatttaatattgtgagagtaac 1296 



II llllllll III llll I I II llll I I I 
Db 921 AAACGACTTTCATTATATTTTTCATTCAATCCTTAATTATTTTATTTAAATATTTTGTAT 862 

Qy 1297 aaarttaaaaaacatagaaacaccaaaagttagttatg 1334 

III: II III II II II II II lllll 
Db 861 AAAAATATTAAAAAGAAGAAAACATCAACTTCATTATG 824 
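
In hits whose identifier carries a "/C" suffix, the Db coordinates run downwards, which suggests the query was matched against the complementary strand of the database entry. A minimal sketch of the reverse-complement operation, assuming that interpretation; the function and table names are hypothetical, and the table covers the IUPAC ambiguity codes that appear in these sequences (such as r and n).

```python
# IUPAC-aware complement table for upper- and lower-case nucleotides.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMSWBDHVNacgtrykmswbdhvn",
    "TGCAYRMKSWVHDBNtgcayrmkswvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


# First 13 query bases of the alignment above (query positions 824-836).
query_start = "cataatgaagttg"
print(reverse_complement(query_start).upper())  # prints "CAACTTCATTATG"
```

The printed string matches the last 13 database characters of the alignment above (the Db line ending at position 824).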



X33181/C 

ID X33181 standard; DNA; 6644 BP. 

AC X33181; 

DT 25-JUN-1999 (first entry) 

DE Base sequence of the plasmid pRx-ires-bsr. 

KW Cowpox virus; bsr; viral vector; expression; apoptosis; resistance; 

KW crmA; bcl-2; bcl-xl; FLIP; survivin; IAP; ILP; adenovirus; cancer; 

KW autoimmune disease; graft rejection reaction; inflammation; 

KW inflammatory disease; ss. 

OS Synthetic. 

OS Cowpox virus. 

PN WO9913073-A2. 

PD 18-MAR-1999. 

PF 07-SEP-1998; J04010. 

PR 08-SEP-1997; JP-259235. 

PA (RPRG-) RPR GENCELL ASIA PACIFIC INC. 

PI Hamada H; 

DR WPI; 99-243728/20. 

PT New apoptosis-resistant virus-sensitive cell 

PS Example 1; Page 38-41; 51pp; English. 

CC The present invention describes an apoptosis-resistant virus-sensitive 

CC cell line into which an apoptosis resistance gene has been introduced. 

CC The recombinant viruses generated are capable of expressing apoptosis- 

CC associated genes. These can then be used in a variety of diseases for 

CC which the induction of apoptosis by gene transfer, or where the 

CC inhibition of harmful apoptosis, is therapeutic. The recombinant viruses 

CC are useful as vectors for gene therapy which can be applied to cancer 

CC therapy for destroying cancer cells selectively, the treatment of 

CC autoimmune diseases and graft rejection reaction, and apoptosis-induction 

CC therapy for inflammatory cells in inflammatory diseases. Prior arts have 

CC encountered the problem where if an adenovirus vector capable of 

CC expressing an apoptosis-associated gene is introduced into animal cells, 

CC the cells producing the virus will be destroyed because the period of 

CC time required to induce cell death by apoptosis is shorter than that 

CC required to replicate and produce the virus, resulting in failure to 

CC obtain a recombinant virus having the integrated apoptosis-associated 

CC gene. In this invention an apoptosis-resistant 293 cell line (having an 

CC apoptosis resistant gene introduced) is established and overcomes the 

CC problem. The present sequence represents the base sequence of the 

CC plasmid pRx-ires-bsr, which contains the cowpox virus bsr gene, and 

CC is used in an example from the present invention. 

SQ Sequence 6644 BP; 2166 A; 1573 C; 1424 G; 1481 T; 



Query Match 3.0%; Score 90.8; DB 1; Length 6644; 

Best Local Similarity 45.4%; Pred. No. 0.00017; 

Matches 326; Conservative 0; Mismatches 392; Indels 0; Gaps 

Qy 1979 caattttcgcagtataagttccttttaatcctttctttttacttcattttataacgaatt 2038 

Db 4433 CAATTCTTTTTTTTTTTTTro 4374 

Qy 2039 ctatggataatgttccctacaaacatgtcattacaatgtttaattataaattccattctt 2098 

I I I I II I llll Mil II I II II II 
Db 4373 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 4314 

Qy 2099 ctattttactaagatattagtaacttcaaactgctgatttttactaatttattatttata 2158 

Db 4313 TTTTTTTTTTTTTTTTTTTTTTTTm 4254 

Qy 2159 aattgttagaatgattatttttcaataatttaacaacaatatttaatattattattatta 2218 

II II I II lllll I III I III I II II II II 
Db 4253 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 4194 
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Qy 783 atggaagggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttt 842 

II I III II I II II Mil I I I II IMII 
Db 300 ATATCAAATAAAACACTTTTTATAATAATACGAAAAATATATTTCCTTATTTTTATGTTT 359 

Qy 843 tcttctttttaatatttttatacaaaatatttaaataaaat — aattaaggattgaat 898 

ii inn ii inn i i i mi n n in n n 

Db 360 TCAAAATTTTAGTAGACTTATAATATTATTATGGATAACATTAACAAATAAAAATATTAT 419 

Qy 899 gaaaaatataatgaaagtcgttttactaatagtcatattgcattttgtcgcatctactta 958 

ii i iii in i mi ii ii ii ii ii ii ii 

Db 420 GAGTATAATATGTAAATTATTTTTTTTTTTTTACAGTTTATATGTTTATGAACATATAAT 479 
Qy 959 aataatagataaattaattgtggtacattagatcaaagaacaaactagattttgtcccat 1018 

i i inn iii mi ii ii ii mi i n 

Db 480 GTGATAAATAAAATTGATTAATTATTATTATATATAATTACTCTTGTAATTTATTAAAAT 539 

Qy 1019 tctattgttaaaagctggtccgtttacattaaaataaggtacatgttacatgccacgtat 1078 

III I I I I I Ml I I I II I II 
Db 540 GGT ATATT ATATATATATAT AT AATTTTTTTTATATTATTTGAATAAAAATATTA - - AAT 597 

Qy 1079 aactatctggttattctatcaatcacgctaatttttaacagtagaaatgaatgtaatttt 1138 

II II Ml II I lllll III I llll I III II I III 
Db 598 AAAAATTTTGTGTTTGGGTAAATCATAATAAGTGCTAACGTTCATAATTTATCTCATTAA 657 

Qy 1139 taaatagaaagggtcaaattgttatttgatctaacacgtagggattaatttacttatttt 1198 

MINIMI I II I lllll | II II II II II I II 
Db 658 AAAATAGAAATG • -AAATATAATATTTACGACAGTACATATATATATATGTATATTATTA 715 

Qy 1199 cctaaagaaataagtaaaatataatttgaatcttaatacaaaaactttcatgatactttt 1258 

III I I II III I II II I III III I I I II II III I I 
Db 716 AAAAAAAATAAAAATAACACAT-ATATATATATATATATATATATATTGATAATATATAT 774 

Qy 1259 atcatattttacttataatttaatattgtgagagtaacaaarttaaaaaacatagaa 1315 

llll III II I I Mill Mllll I III II 
Db 775 GTTTTAAGTATGGATAAATCAAAAAGTTCCATAGAGAAAGAATTAAATAGGATAAAA 831 



RESULT 6 
T72882 

ID T72882 standard; cDNA; 19124 BP. 
AC T72882; 

DT 12-SEP-1997 (first entry) 
DE Plasmodium var-7 gene. 

KW DBL gene family; SABP; sialic acid binding protein; vaccine; therapy; 
KW Duffy binding like gene; Duffy antigen binding protein; erythrocyte; 
KW DABP; merozoite; malaria; var-1; var-2; var-3; var-7; immune response; 
KW Plasmodium; ss. 
OS Plasmodium vivax. 
OS Plasmodium falciparum. 
FH Key Location/Qualifiers 
FT exon 7317..15139 
FT /*tag= a 
FT /number= 1 
FT intron 15140..16205 
FT /*tag= b 
FT /number= 1 
FT exon 16206..17552 
FT /*tag= c 
FT /number= 2 
FT /note= "no stop codon given" 
PN WO9640766-A2. 
PD 19-DEC-1996. 
PF 07-JUN-1996; U09508. 
PR 07-JUN-1995; US-487826. 
PA (USSH ) US DEPT HEALTH & HUMAN SERVICES. 
PI Chitnis C, Miller LH, Peterson DS, Sim KL, Su X; 
PI Wellems TE; 
DR WPI; 97-052231/05. 
DR P-PSDB; W22475. 
PT New malaria vaccines - contains cysteine-rich DBL family protein 
PT binding domains homologous domains of the Duffy and sialic acid 
PT binding proteins 
PS Claim 4; Page 56-61; 96pp; English. 

CC This sequence represents the var-7 gene of Plasmodium. Var-7 belongs to 

CC the Duffy binding like (DBL) family of genes which have homology to the 

CC Duffy antigen binding protein (DABP) and sialic acid binding protein 

CC (SABP) conserved regions (see T72889 and T72888 respectively). The var 

CC family of genes modulate cytoadherence and antigenic variation of 

CC Plasmodium infected erythrocytes. SABP and the Duffy antigen binding 

CC protein (DABP) are soluble proteins that appear in the culture 

CC supernatant after infected erythrocytes release merozoites. DABP and SABP 

CC mediate the binding of merozoites and schizonts to the erythrocyte 

CC surface. These proteins are necessary for erythrocyte invasion by the 

CC parasite. This sequence can be used in the compositions of the invention. 

CC The compositions are for the treatment and prevention of malaria, and 

CC comprise either a nucleotide sequence or encoded polypeptide of the 

CC var-1, var-2, var-3 or var-7 genes of the DBL gene family, a family of 

CC genes having homology with conserved regions of DABP and SABP. The 

CC compositions are used for the treatment and prevention of malaria. They 

CC are also used in the preparation of vaccines for inducing a protective 

CC immune response in a mammal to Plasmodium merozoites (especially 

CC Plasmodium falciparum or Plasmodium vivax). 

SQ Sequence 19124 BP; 7824 A; 2190 C; 2790 G; 6320 T; 
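
The SQ line gives the total length followed by per-base counts, so the counts can be checked against the stated length and used to estimate base composition. A minimal sketch, assuming the line layout shown above; the function name `parse_sq_line` is hypothetical.

```python
import re


def parse_sq_line(line):
    """Parse an 'SQ Sequence N BP; n A; n C; n G; n T;' line.

    Returns the stated length and a dict of per-base counts.
    """
    total = int(re.search(r"Sequence\s+(\d+)\s+BP", line).group(1))
    counts = {base: int(n) for n, base in re.findall(r"(\d+)\s+([ACGT]);", line)}
    return total, counts


total, counts = parse_sq_line(
    "SQ Sequence 19124 BP; 7824 A; 2190 C; 2790 G; 6320 T;"
)
# For this entry the four counts add up to the stated length.
assert sum(counts.values()) == total
at_fraction = (counts["A"] + counts["T"]) / total
print(f"AT content: {at_fraction:.1%}")  # prints "AT content: 74.0%"
```

Entries that contain ambiguity codes report fewer A/C/G/T bases than the stated length, so the equality check is only expected to hold for fully resolved sequences.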



Query Match 3.2%; Score 96.2; DB 1; Length 19124; 

Best Local Similarity 45.1%; Pred. No. 3.2e-05; 

Matches 591; Conservative 0; Mismatches 709; Indels 10; Gaps 6; 

Qy 138 aatataataaatacatcgtagaaataaattttattcaaattgaagtct---taaccatct 194 

MM I II II I H 1 1 1 1 1 1 III II I III I II II I 
Db 5418 AATACGTAACATGTATTATAGAAATAATAAGAATTTAATATTAAGGATAAATATAAATAT 5477 

Qy 195 ttaatatttgtagatgtaatttaaatgaaagataaatacatattcttggacatgtatttt 254 

llll III I I II MM I I II II II lllll I I 
Db 5478 TTAAAATTATATTTTTTTATGTCAATTTATGTTATATTATATTATATTAACATGATTAGT 5537 

Qy 255 catcttaatgtttgtggctttggtgataggtgtattgatgtacgatgtcttttaaatcac 314 

I I II I I I III II II II III 

Db 5538 TTTTTGAAAAATATTTAAATATCATATAATAATAATAAATTAGTTAAAATAATAGTATT1 5597 

Qy 315 atatcacattttgagtttgtatgatgataagtcgacataancgaaatatggtgtgatctt 374 

II I I I II II I III Ml I Mill Ml f 
Db 5598 CATACAAAATACTAACTTATAAGTATATCATATAATATTATATATATATATATTTATGTG 5657 

Qy 375 cacttttgaactttgataagtcaccaaactttaacaaagtttg-'-attgtgtacatata 431 

II I III II II II llll III I II llll 
Db 5658 TTTTTGATTGGGTGTATATAAGGCTATAAGTATATATGGGTTGTTCATTATATATTTATA 5717 

Qy 432 tatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttatattat 491 

I I III I II II llll I I II I I I III I llll II 
Db 5718 TGTGAATAGATACATATAAGTTAAT-ATATTTATTTGTGTATATGTCTGTGTTAAGATAG 5776 

Qy 492 ttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataaaaaaa 551 

Db 5777 ATATGCATTACAGTTAAGGGTTATAGTTTTTTm 5836 

Qy 552 attgtacacatttacaagcccatatacaaataattatataaatattcattaaaaaatata 611 

lllll I lllll II III llll III llll lllll 
Db 5837 AATAGATAACTAACAATATGCATATTACAAGAATAATATTTGTATAAAAT-ATATATATA 5895 

Qy 612 tttaaatataggatataaatataactattttagaattattctactttaagataacatagg 671 

• i ii inn i i ii linn ii i ii iii i i 

Db 5896 TATATATATATAAAGACATTAAAACTATACTAATAGGTAATTAGTTTTATTATATCATCC 5955 
Qy 672 ttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaacat 731 
Db 5956 TTTTATTATTATMTTTTTTTTGTTTTACTOT 6015 

Qy 732 aatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaaggg 791 

II Ml I III I I II II II I Mill 

Db 6016 AACAAATATAAAACAATATCAGTATTTGGAATATAAATAMTTTATTCTACATATATGCA 6075 

Qy 792 aaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttcttctttt 851 

II I I I I I Mil III I I Ml I Ml 
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Sequence 1864 BP; 786 A; 210 C; 44 G; 



Query Match 3.6%; Score 109.6; DB 1; Length 1864; 

Best Local Similarity 44.7%; Pred. No. 8.3e-07; 

Matches 559; Conservative 0; Mismatches 674; Indels 18; Gaps 6; 

Qy 4 gatgagaaccaatttttaatagtaaancctaaccaatttttaataataaagctgactcct 63 

II I II Mil I III II III I II II I I II II 
Db 1608 GAATAAAAAAAATTATNTCTAGATAATATTAAATATTTAGAAAATAAGATATATATTTCT 1549 

Qy 64 agtacaagagcttttattcattcttctattttgctttcctctaggcttggcaatcgagaa 123 

I I II II Mil II II I II I 

Db 1548 AATNTNTTAGGAAGNTNTATCTATTCTTCTTAATAATCTTAATAATTTTTTATAGAGNTT 1489 

Qy 124 ttttcttgtgttacaatataataaatacatcgtagaaataaattttattcaaattgaagt 183 

I II II II I III I III I I I ! I II II III 
Db 1488 TATTATTNNNNTAGAGTATTTTTTATAGTTTCTTAAGTAATTTATTNTTAGNTTAGTTAT 1429 

Qy 184 cttaaccatctttaatatttgtagatgtaatttaaatgaaagataaatacatattcttgg 243 

III I II III I I II I I I I I lllll I 
Db 1428 TATAATTAATAATATTATAATAAAGTTTAGTAANAGTTTNAAGGTCAANTATATTTATAT 1369 

Qy 244 acatgtattttcatcttaatgtttgtggctttggtgataggtgtattgatgtacgatgtc 303 
I I III llll I I I I II I II I II II III 

Db 1368 ATTTTAATTAATTAATTAAATTNTTTAGTATTANTAAGATTTAAGTTATAGTTAGATAAA 1309 

Qy 304 tttta-aatcacatatcacattttgagtttgtatgatgataagtcgacataancgaaata 362 

I II I llll I I lllll III II II 
Db 1308 TACTATTNTTTAATATTATTNAGTTTTAGTAGNTATTTTTTATATATTTTAAGTTAATTA 1249 

Qy 363 tggtgtgatcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgt 422 

I II I I I I III II I I I I II I I II 
Db 1248 TTGTNTANNNTGGNTAGTATATCTTATTTATAGTATAATAATATATNTAGGNAAAGNTGC 1189 

Qy 423 gtacatatatatatatatcttcaaattttataataaaaattgtgtttaaataatttacag 482 

I II II II II llll II I II llll II II I Ml 
Db 1188 TAGCTTANGTAGTTAATATTTAAAATAAAATTAAAATAATTAAGTATATCTGTNTAATAT 1129 

Qy 483 ttatattatttttt-tatctctaattttatttgtcgccaaatttttagttgatatttta 540 

I II lllll II II lllllll II I I II I Ml II 
Db 1128 ATTTANTATTTATTAAAATATCTAATTATAGTAAAGTATAGTAGGTTTTTATAATTTATA 1069 

Qy 541 acataaaaaaaattgtacacatttacaagcccatatacaaataattatataaatattcat 600 

I I II llllll I I I III III II I I II II I llll II 
Db 1068 ATAATAATAAAATTTTTTAAAGTTATAAGNTNTAATTTNTAAANTTTTA-AGCTATTAAT 1010 

Qy 601 taaaaaatatatttaaatataggatataaatataactattttagaattattctactttaa 660 

I I III II II II III llllll I I I II II 
Db 1009 AATTANTAATAATTTTTATTATAAATTTTATAATTATATTTTGCAGGGAGAGTTGTTAAA 950 

Qy 661 gataacataggttaaatgtataattaataaggttagtttattgtaaagatgagtatatat 720 

I I III II II II II I III I llll III II I I II 
Db 949 TAATAAATATAAAAATATAGTAGTTTATTA- - TTATTATATTAT AATAATATTTTTNTAG 892 

Qy 721 gtcgtaaacataatcactaaccatttttattaacttcttggttttgaagttccaaaaaga 780 

I llll I I II III lllll II II III 
Db 891 TAGTGTAGTTTAATATAAGAGCTTTAAAAATTATTTCTTAATTAGGATTAAATTATATAA 832 

Qy 781 aaatggaagggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgt 840 

llll llll I I I I I I II I II I II II II II II I 
Db 831 AMTAGAAGTTATIATAAATMTIAATTTAGGTATTTAAANTAGGTATACTTNTTAATNT 772 

Qy 841 tttcttctttttaatatttttatacaaaatatttaaataaaataattaaggattgaatga 900 

I III I IMMII II III II II I I I I I 
Db 771 ATAAAAATTTAATAATTTTTTATATTAGAATAGTAAGTAGGTATATNAGGTCTAAGAAGT 712 

Qy 901 aaaatataatgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaa 960 

llllll II II III I I I I I I II I II III 
Db 711 TTTATATAAATTAATAAAAATTTATAAATNTTAAAAAGAAGTATATTTTTTTTAAGTAAT 652 

Qy 961 taatagataaattaattgtggtacattagatcaaagaacaaactagattttgtcccattc 1020 

I III llll II II II I III I I II 



Db 651 ATAAGGATTTATTATTTAAGGATAGCTTATTTTCTTAG---GATCTATTTATATAN 599 



Qy 1021 tattgttaaaagctggtccgtttacattaaaataaggtacatgttacatgccacgtataa 1080 

llll I I I I I I II II I I I II I III I 

Db 598 NGTTNTTTATATTAATAATAAATCTTAAAGACTTAGTTAATTTTAAAATTATATTTATTA 539 

Qy 1081 ctatctggttattctatcaatcacgctaatttttaacagtagaaatgaatgtaattttta 1140 

III I III I I II III II I III II lllll I I 
Db 538 ATATTAAGGTATAGTNNTGCTTTGNNTAGTTTNTAGAAAATTAATTATATATAATTATAA 479 

Qy 1141 aa tagaaagggtcaaattgttatttgatctaacacgtagggattaatttacttat 1195 

I I I I I I I I I II II II I II I II I II 
Db 478 TACTTTTTTAATAAGNTATIAGTAAAAATTAATATATTATCTAAGTATATAAAGGNAAAT 419 

Qy 1196 tttcctaaagaaataagtaaaatataatttgaatcttaatacaaaaacttt 1246 

II III I I II llll II II II II II II II I II 
Db 418 TTATCTAGAAAGTTATGTAAGATNTAGTTAATATAATATTAGAATATAGTT 368 



RESULT 4 
T72882/C 

ID T72882 standard; cDNA; 19124 BP. 
AC T72882; 

DT 12-SEP-1997 (first entry) 
DE Plasmodium var-7 gene. 

KW DBL gene family; SABP; sialic acid binding protein; vaccine; therapy; 
KW Duffy binding like gene; Duffy antigen binding protein; erythrocyte; 
KW DABP; merozoite; malaria; var-1; var-2; var-3; var-7; immune response; 
KW Plasmodium; ss. 
OS Plasmodium vivax. 
OS Plasmodium falciparum. 
FH Key Location/Qualifiers 
FT exon 7317..15139 
FT /*tag= a 
FT /number= 1 
FT intron 15140..16205 
FT /*tag= b 
FT /number= 1 
FT exon 16206..17552 
FT /*tag= c 
FT /number= 2 
FT /note= "no stop codon given" 
PN WO9640766-A2. 
PD 19-DEC-1996. 
PF 07-JUN-1996; U09508. 
PR 07-JUN-1995; US-487826. 
PA (USSH ) US DEPT HEALTH & HUMAN SERVICES. 
PI Chitnis C, Miller LH, Peterson DS, Sim KL, Su X; 
PI Wellems TE; 
DR WPI; 97-052231/05. 
DR P-PSDB; W22475. 
PT New malaria vaccines - contains cysteine-rich DBL family protein 
PT binding domains homologous domains of the Duffy and sialic acid 
PT binding proteins 
PS Claim 4; Page 56-61; 96pp; English. 

CC This sequence represents the var-7 gene of Plasmodium. Var-7 belongs to 
CC the Duffy binding like (DBL) family of genes which have homology to the 
CC Duffy antigen binding protein (DABP) and sialic acid binding protein 
CC (SABP) conserved regions (see T72889 and T72888 respectively). The var 
CC family of genes modulate cytoadherence and antigenic variation of 
CC Plasmodium infected erythrocytes. SABP and the Duffy antigen binding 
CC protein (DABP) are soluble proteins that appear in the culture 
CC supernatant after infected erythrocytes release merozoites. DABP and SABP 
CC mediate the binding of merozoites and schizonts to the erythrocyte 
CC surface. These proteins are necessary for erythrocyte invasion by the 
CC parasite. This sequence can be used in the compositions of the invention. 
CC The compositions are for the treatment and prevention of malaria, and 
CC comprise either a nucleotide sequence or encoded polypeptide of the 
CC var-1, var-2, var-3 or var-7 genes of the DBL gene family, a family of 
CC genes having homology with conserved regions of DABP and SABP. The 
CC compositions are used for the treatment and prevention of malaria. They 
CC are also used in the preparation of vaccines for inducing a protective 
CC immune response in a mammal to Plasmodium merozoites (especially 



Tue Sep 5 07:23:00 2000 



us-08-984-1 



■099-15. rag 



Page 



181 AGTCTTAACCATCTTTAATATTTGTAGATGTAATTTAAATGAAAGATAAATACATATTCT 240 

241 tggacatgtattttcatcttaatgtttgtggctttggtgataggtgtattgatgtacgat 300 

241 TGGACATGTATTTTCATCTTAATGTTTGTGGCTTTGGTGATAGGTGTATTGATGTACGAT 300 

301 gtcttttaaatcacatatcacattttgagtttgtatgatgataagtcgacataancgaaa 360 

301 GTCTTTTAAATCACATATCACATTTTGAGTTTGTATGATGATAAGTCGACATAANCGAAA 360 

361 tatggtgtgatcttcacttttgaactttgataagtcaccaaactttaacaaagtttgatt 420 

361 TATGGTGTGATCTTCACTTTTGAACTTTGATAAGTCACCAAACTTTAACAAAGTTTGATT 420 

421 gtgtacatatatatatatatcttcaaattttataataaaaattgtgtttaaataatttac 480 

421 GTGTACATATATATATATATCTTCAAATTTTATAATAAAAATTGTGTTTAAATAATTTAC 480 

481 agttatattatttttttatctctaattttatttgtcgccaaatttttagttgatatttta 540 

481 AGTTATATTATTTTTTTATCTCTAATTTTATTTGTCGCCAAATTTTTAGTTGATATTTTA 540 

541 acataaaaaaaattgtacacatttacaagcccatatacaaataattatataaatattcat 600 

541 ACATAAAAAAAATTGTACACATTTACAAGCCCATATACAAATAATTATATAAATATTCAT 600 

601 taaaaaatatatttaaatataggatataaatataactattttagaattattctactttaa 660 

601 TAAAAAATATATTTAAATATAGGATATAAATATAACTATTTTAGAATTATTCTACTTTAA 660 

661 gataacataggttaaatgtataattaataaggttagtttattgtaaagatgagtatatat 720 

661 GATAACATAGGTTAAATGTATAATTAATAAGGTTAGTTTATTGTAAAGATGAGTATATAT 720 

721 gtcgtaaacataatcactaaccatttttattaacttcttggttttgaagttccaaaaaga 780 

721 GTCGTAAACATAATCACTAACCATTTTTATTAACTTCTTGGTTTTGAAGTTCCAAAAAGA 780 

781 aaatggaagggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgt 840 

781 AAATGGAAGGGAAATTTGAGAGTAAGTTCATGTTTATATTATACATAATGAAGTTGATGT 840 

841 tttcttctttttaatatttttatacaaaatatttaaataaaataattaaggattgaatga 900 

841 TTTCTTCTTTTTAATATTTTTATACAAAATATTTAAATAAAATAATTAAGGATTGAATGA 900 

901 aaaatataatgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaa 960 

901 AAAATATAATGAAAGTCGTTTTACTAATAGTCATATTGCATTTTGTCGCATCTACTTAAA 960 

961 taatagataaattaattgtggtacattagatcaaagaacaaactagattttgtcccattc 1020 

961 TAATAGATAAATTAATTGTGGTACATTAGATCAAAGAACAAACTAGATTTTGTCCCATTC 1020 

1021 tattgttaaaagctggtccgtttacattaaaataaggtacatgttacatgccacgtataa 1080 

1021 TATTGTTAAAAGCTGGTCCGTTTACATTAAAATAAGGTACATGTTACATGCCACGTATAA 1080 

1081 ctatctggttattctatcaatcacgctaatttttaacagtagaaatgaatgtaattttta 1140 

1081 CTATCTGGTTATTCTATCAATCACGCTAATTTTTAACAGTAGAAATGAATGTAATTTTTA 1140 

1141 aatagaaagggtcaaattgttatttgatctaacacgtagggattaatttacttattttcc 1200 

1141 AATAGAAAGGGTCAAATTGTTATTTGATCTAACACGTAGGGATTAATTTACTTATTTTCC 1200 

1201 taaagaaataagtaaaatataatttgaatcttaatacaaaaactttcatgatacttttat 1260 

1201 TAAAGAAATAAGTAAAATATAATTTGAATCTTAATACAAAAACTTTCATGATACTTTTAT 1260 

1261 catattttacttataatttaatattgtgagagtaacaaarttaaaaaacatagaaacacc 1320 



1261 CATATTTTACTTATAATTTAATATTGTGAGAGTAACAAARTTAAAAAACATAGAAACACC 1320 

1321 aaaagttagttatggtgtgactcatatacacagttaaaatttgaataaatttttttcttc 1380 

1321 AAAAGTTAGTTATGGTGTGACTCATATACACAGTTAAAATTTGAATAAATTTTTTTCTTC 1380 

1381 gtcattaattccatcatgggtttttttttttctagttaagccataattatcaaaataatc 1440 

1381 GTCATTAATTCCATCATGGGTTTTTTTTTTTCTAGTTAAGCCATAATTATCAAAATAATC 1440 

1441 atcattaatcctatcaataccccgccctgcctccctccctcaatacttaaacccaactaa 1500 

1441 ATCATTAATCCTATCAATACCCCGCCCTGCCTCCCTCCCTCAATACTTAAACCCAACTAA 1500 

1501 cacccagcaccaaacgcactttaatagccacctatttctagccatgtccttgcacttaaa 1560 

1501 CACCCAGCACCAAACGCACTTTAATAGCCACCTATTTCTAGCCATGTCCTTGCACTTAAA 1560 

1561 gaaaagtaaagctaacctgcaatcattccatatcgaggcctcaacagataaagttggttg 1620 

1561 GAAAAGTAAAGCTAACCTGCAATCATTCCATATCGAGGCCTCAACAGATAAAGTTGGTTG 1620 

1621 atgggtttgcaccaagttgttaaaacccggccctcaacttcccttttcttttcatcctcc 1680 

1621 ATGGGTTTGCACCAAGTTGTTAAAACCCGGCCCTCAACTTCCCTTTTCTTTTCATCCTCC 1680 

1681 ccactccacaccctccaattttcttcatatggttctattataagttctttataatcacag 1740 

1681 CCACTCCACACCCTCCAATTTTCTTCATATGGTTCTATTATAAGTTCTTTATAATCACAG 1740 

1741 aatcaagataagtcctcagcaaacaaaaaaccatggctctcgagcaagatctggactagt 1800 

1741 AATCAAGATAAGTCCTCAGCAAACAAAAAACCATGGCTCTCGAGCAAGATCTGGACTAGT 1800 

1801 cagagctctgaatattggatcattattacagtcaaaaacagttaacaaaagctgttgcag 1860 

1801 CAGAGCTCTGAATATTGGATCATTATTACAGTCAAAAACAGTTAACAAAAGCTGTTGCAG 1860 

1861 ataaacactgaatctgctatagtttgtttttggtttacatatgttccacgtgaaactatg 1920 

1861 ATAAACACTGAATCTGCTATAGTTTGTTTTTGGTTTACATATGTTCCACGTGAAACTATG 1920 

1921 aagcatctctaagaaaacccaaactatcatatcaacccatcgatcaatgaatcgatttca 1980 

1921 AAGCATCTCTAAGAAAACCCAAACTATCATATCAACCCATCGATCAATGAATCGATTTCA 1980 

1981 attttcgcagtataagttccttttaatcctttctttttacttcattttataacgaattct 2040 

1981 ATTTTCGCAGTATAAGTTCCTTTTAATCCTTTCTTTTTACTTCATTTTATAACGAATTCT 2040 

2041 atggataatgttccctacaaacatgtcattacaatgtttaattataaattccattcttct 2100 

2041 ATGGATAATGTTCCCTACAAACATGTCATTACAATGTTTAATTATAAATTCCATTCTTCT 2100 

2101 attttactaagatattagtaacttcaaactgctgatttttactaatttattatttataaa 2160 

2101 ATTTTACTAAGATATTAGTAACTTCAAACTGCTGATTTTTACTAATTTATTATTTATAAA 2160 

2161 ttgttagaatgattatttttcaataatttaacaacaatatttaatattattattattatt 2220 

2161 TTGTTAGAATGATTATTTTTCAATAATTTAACAACAATATTTAATATTATTATTATTATT 2220 

2221 atttctcaatttttattaaacaaaaacataaatttttgacaaattaaaataaatgaatta 2280 

2221 ATTTCTCAATTTTTATTAAACAAAAACATAAATTTTTGACAAATTAAAATAAATGAATTA 2280 

2281 atttctcaatttttcgtgcaactattacaaaaatccttcatagtcctaatcttaatttga 2340 

2281 ATTTCTCAATTTTTCGTGCAACTATTACAAAAATCCTTCATAGTCCTAATCTTAATTTGA 2340 

2341 tgcagaggtgataataatcttaatttgatgcagaggtaataatgggccgggtttgagctg 2400 

2341 TGCAGAGGTGATAATAATCTTAATTTGATGCAGAGGTAATAATGGGCCGGGTTTGAGCTG 2400 
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Db 139996 TTCATCTATATMAMTCATATTAATTATATTCATTTCTATTATATATATATCTAATTCA 140055 

Qy 2859 attgagcttaattaatatt 2877 

III I I IN I 
Db 140056 ATTAATAACATATAATAAT 140074 



RESULT 15 
AC005504 
LOCUS       AC005504 104992 bp DNA HTG 01-APR-1999 
DEFINITION  Plasmodium falciparum chromosome 12, *** SEQUENCING IN PROGRESS 
            ***, 3 unordered pieces. 
ACCESSION   AC005504 
VERSION     AC005504.3 GI:4558584 
KEYWORDS    HTG; HTGS_PHASE1. 
SOURCE      malaria parasite P. falciparum. 
  ORGANISM  Plasmodium falciparum 
            Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium. 
REFERENCE   1 (bases 1 to 104992) 
  AUTHORS   Hyman,R.W., Fung,E.L., Qin,F., Tamaki,T., Kurdi,O.B., Conway,A.B. 
            and Davis,R.W. 
  TITLE     Plasmodium falciparum 3D7 chromosome 12 
  JOURNAL   Unpublished 
REFERENCE   2 (bases 1 to 104992) 
  AUTHORS   Hyman,R.W., Qin,F., Fung,E.L., Conway,A.B. and Davis,R.W. 
  TITLE     Direct Submission 
  JOURNAL   Submitted (21-AUG-1998) Stanford DNA Sequencing and Technology 
            Center, Stanford University, 855 California Avenue, Palo Alto, CA 
            94304, USA 
COMMENT     On Apr 2, 1999 this sequence version replaced gi:4337172. 
            * NOTE: This is a 'working draft' sequence. It currently 
            * consists of 3 contigs. The true order of the pieces 
            * is not known and their order in this sequence record is 
            * arbitrary. Gaps between the contigs are represented as 
            * runs of N, but the exact sizes of the gaps are unknown. 
            * This record will be updated with the finished sequence 
            * as soon as it is available and the accession number will 
            * be preserved. 
            * 1 58642: contig of 58642 bp in length 
            * 58643 58842: gap of unknown length 
            * 58843 91011: contig of 32169 bp in length 
            * 91012 91211: gap of unknown length 
            * 91212 104992: contig of 13781 bp in length. 
FEATURES             Location/Qualifiers 
     source          1..104992 
                     /organism="Plasmodium falciparum" 
                     /db_xref="taxon:5833" 
                     /chromosome="12" 
BASE COUNT  44286 a 9326 c 9564 g 41411 t 405 others 



Query Match 5.1%; Score 156.6; DB 41; Length 104992; 

Best Local Similarity 44.6%; Pred. No. 6.3e-09; 

Matches 1232; Conservative 0; Mismatches 1491; Indels 42; Gaps 14; 

Qy 22 atagtaaancctaaccaatttttaataataaagctgactcctagtacaagagcttttatt 81 

III III I I III I III I I Ml I II III 
Db 72473 ATAATAATCACATATTAATATAATATATATTTATTAATTTATAAAATAAATAAAAATATA 72532 

Qy 82 cattcttctattttgctttcctctaggcttggcaatcgagaattttcttgtgttacaata 141 

II I I I I II I II I I Mill II II I II 

Db 72533 TATATATAATTATATAATATATCAAATTAAATCATTATAAAATTTATTTAAAATATATTA 72592 

Qy 142 taataaatacatcgtagaaataaattttattcaaattgaagtcttaaccatctttaatat 201 

III II I II I III II I II II I llll I Mil 

Db 72593 AAATTAA-ATATATATATATTAATAAATAATTAAGTTAATTATTTAATAAATSAAAATAA 72651 

Qy 202 ttgtagatgtaatttaaatgaaagataaatacatattcttggacatgtattttcatctta 261 

I II II III I III Mill llll I I II I II I I 
Db 72652 TAATAAATTTAAATATTAATATAATTAAATTCATAATACACATTAATTAATAAAATATGA 72711 

Qy 262 atgtttgtggctttggtgataggtgtattgatgtacgatgtcttttaaatcacatatcac 321 

II II I I I I I II I II II I I II I 

Db 72712 ATATTAATATAAATAATAAATAGAAAAATATTAATACAATTTAAATATTAAATAAATAAA 72771 

Qy 322 attttgagtttgtatgatgataagtcgacataancgaaatatggtgtgatcttcactttt 381 

I I I I I III I I I II II II I II I I I 
Db 72772 AATATTATAATTTATAAATAATAAAATATTAATATAAATTAATTAATAATATATAATAAA 72831 

Qy 382 gaactttgataagtcaccaaactttaacaaagtttgattgtgtacatatatatatatatc 441 

I I I II I III I II II I III llll llll 

Db 72832 TTAATATAATTAAATTTAAAATATAAATTAATAAAAAAATAATACTAATATTAATATAAA 72891 

Qy 442 ttcaaattttataataaaaattgtgtttaaataatttacagttatattatttttttatct 501 

I llll II I I llll Ml III I I I III II II III 
Db 72892 ATAAAATAATAATAAATAAAT TTTAATTAAAATTAAATAATAAAATATTAATATAA 72947 

Qy 502 ctaattttatttgtcgccaaatttttagttgatattttaacataaaaaaaattgtacaca 561 

I I II I I I III I I I II III llll III II I 

Db 72948 ATTAATTAAATAATAATATAATAAATTAATTAAATAATAATATAATAAATTAATTAATTA 73007 

Qy 562 tttacaagcccatatacaaataattatataaatattcattaaaaaatatatttaaatata 621 

I I III HIM III HIM I II III I Mill I 
Db 73008 ACAATATTTAAATAATTAAATATAATAATATATATTAAACAATTAATTATTATAAATTAA 73067 

Qy 622 ggatataaatataactattttagaattattctactttaagataacataggttaaatgtat 681 

II II llll II I UN M Ml I II I 

Db 73068 AGAATTATTAATAATAATATATAAATTAATTAAATAAAATGAATTAT TTA 73117 

Qy 682 aattaataaggttagtttattgtaaagatgagtatatatgtcgtaaacataatcactaac 741 

III I III I III II I I Mill I HIM I II III 

Db 73118 AATAATTAAAACAAT AAT ATATATAAATT AATTATATATTT AGTAAATAAAA — TAAA 73173 

Qy 742 catttttattaacttcttggttttgaagttccaaaaagaaaatggaagggaaatttgaga 801 

I I Mill II I III I I Mill llll II I I II I 

Db 73174 ATTAATAATTAAATTAATAATTTATAGTTATAAAAAATAAAAATAAATATATAATTAAAT 73233 

Qy 802 gtaagttcatgtttatattatacataatgaagttgatgttttcttctttttaatattttt 861 

II MM Ml HUM II I I I I I II III I I 

Db 73234 ATATAAATAAATTAAATAAATATATAAATAAATTAAATATATATACAATTAAAT-TAAAT 73292 

Qy 862 atacaaaatatttaaataaaataattaaggattgaatgaaaaatataatgaaagtcgttt 921 
Ml I III I II llll II llll III III Mill II II 
Db 73293 ATATATATAATTAAATTAAATATATATATAATTAAATTAAATATATATATAATTAAATTA 73352 

Qy 922 tactaatagtcatattgcattttgtcgcatctacttaaataatagataaattaattgtgg 981 

I Mill I I I III I I III I I II I I I 
Db 73353 AATTAATAAATAAAATAAATTAATAAATAAAATATATAATTAAATTAAATAAATATATAA 73412 

Qy 982 tacattagatcaaagaacaaactagattttgtcccattctattgttaaaagctggtccgt 1041 

I I I III II III II II I I I I I II II 
Db 73413 TTAAATTAATATATAAATAAATTAAATATATATATAATTAAATTATATTATATAMTTAA 73472 

Qy 1042 ttacat-taaaataaggtacatgttacatgccacgtataactatctggttattctatcaa 1100 

I I II III llll MM MM III II Ml 

Db 73473 TAATATATAATATAATAAATAAAATAATAATTAAATAATAATATATAATATAATAAATAA 73532 

Qy 1101 tcacgctaatttttaacagtagaaatgaatgtaatttttaaatagaaagggtcaaattgt 1160 

II llll llll II II III II I I I Ml I 

Db 73533 ACAATAATTAAATTAATAATATATTATAAATAATATAATAATTAAATATAATATAATAAT 73592 

Qy 1161 tatttgatctaacacgtagggattaatttacttattttcctaaagaaataagtaaaatat 1220 

II II llll I I MM II Ml I llll I MM 

Db 73593 TAATTAAATTAATA- - • ATATATTAAATAAAATAATAATAAATATTAAACAATTAAATAA 73649 

Qy 1221 aatttgaatcttaatacaaaaactttcatgatacttttatcatattttacttataattta 1280 

III llll I II Ml II I III III llll I 

Db 73650 AATATACATAATTAATATTAATAAATAATTATTATATTAAAATAATTAAAAAAATTAATT 73709 

Qy 1281 atattgtgagagtaacaaarttaaaaaacatagaaacaccaaaagttagttatggtgtga 1340 

I llll Ml II II II I III II II M Ml 
Db 73710 AATTTAAGATAATATATAATATTATATATAATTAATTATTAATAAATGTTTTATATTTAA 73769 

Qy 1341 ctcatatacacagttaaaatttgaataaatttttttcttcgtcattaattccatcatggg 1400 

I I llllll MM II II II UN llll 
10319: gap of unknown length
* 10320 10964: contig of 645 bp in length
* 10965 11044: gap of unknown length
* 11045 11648: contig of 604 bp in length
* 11649 11728: gap of unknown length
* 11729 12696: contig of 968 bp in length
* 12697 12776: gap of unknown length
* 12777 13976: contig of 1200 bp in length
* 13977 14056: gap of unknown length
* 14057 15045: contig of 989 bp in length
* 15046 15125: gap of unknown length
* 15126 15969: contig of 844 bp in length
* 15970 16049: gap of unknown length
* 16050 16859: contig of 810 bp in length
* 16860 16939: gap of unknown length
* 16940 17662: contig of 723 bp in length
* 17663 17742: gap of unknown length
* 17743 18767: contig of 1025 bp in length
* 18768 18847: gap of unknown length
* 18848 19809: contig of 962 bp in length
* 19810 19889: gap of unknown length
* 19890 21046: contig of 1157 bp in length
* 21047 21126: gap of unknown length
* 21127 21826: contig of 700 bp in length
* 21827 21906: gap of unknown length
* 21907 23735: contig of 1829 bp in length
* 23736 23815: gap of unknown length
* 23816 25556: contig of 1741 bp in length
* 25557 25636: gap of unknown length
* 25637 26792: contig of 1156 bp in length
* 26793 26872: gap of unknown length
* 26873 28359: contig of 1487 bp in length
* 28360 28439: gap of unknown length
* 28440 29898: contig of 1459 bp in length
* 29899 29978: gap of unknown length
* 29979 31836: contig of 1858 bp in length
* 31837 31916: gap of unknown length
* 31917 33347: contig of 1431 bp in length
* 33348 33427: gap of unknown length
* 33428 34568: contig of 1141 bp in length
* 34569 34648: gap of unknown length
* 34649 35754: contig of 1106 bp in length
* 35755 35834: gap of unknown length
* 35835 37815: contig of 1981 bp in length
* 37816 37895: gap of unknown length
* 37896 39641: contig of 1746 bp in length
* 39642 39721: gap of unknown length
* 39722 41135: contig of 1414 bp in length
* 41136 41215: gap of unknown length
* 41216 42477: contig of 1262 bp in length
* 42478 42557: gap of unknown length
* 42558 44229: contig of 1672 bp in length
* 44230 44309: gap of unknown length
* 44310 45922: contig of 1613 bp in length
* 45923 46002: gap of unknown length
* 46003 47999: contig of 1997 bp in length
* 48000 48079: gap of unknown length
* 48080 49982: contig of 1903 bp in length
* 49983 50062: gap of unknown length
* 50063 51360: contig of 1298 bp in length
* 51361 51440: gap of unknown length
* 51441 53101: contig of 1661 bp in length
* 53102 53181: gap of unknown length
* 53182 54926: contig of 1745 bp in length
* 54927 55006: gap of unknown length
* 55007 56937: contig of 1931 bp in length
* 56938 57017: gap of unknown length
* 57018 57606: contig of 589 bp in length
* 57607 57686: gap of unknown length
* 57687 58632: contig of 946 bp in length
* 58633 58712: gap of unknown length
* 58713 60613: contig of 1901 bp in length
* 60614 60693: gap of unknown length



* 60694 62727: contig of 2034 bp in length
* 62728 62807: gap of unknown length
* 62808 65311: contig of 2504 bp in length
* 65312 65391: gap of unknown length
* 65392 66685: contig of 1294 bp in length
* 66686 66765: gap of unknown length
* 66766 68830: contig of 2065 bp in length
* 68831 68910: gap of unknown length
* 68911 71103: contig of 2193 bp in length
* 71104 71183: gap of unknown length
* 71184 72193: contig of 1010 bp in length
* 72194 72273: gap of unknown length
* 72274 74138: contig of 1865 bp in length
* 74139 74218: gap of unknown length
* 74219 76236: contig of 2018 bp in length
* 76237 76316: gap of unknown length
* 76317 77913: contig of 1597 bp in length
* 77914 77993: gap of unknown length
* 77994 80808: contig of 2815 bp in length
* 80809 80888: gap of unknown length
* 80889 82776: contig of 1888 bp in length
* 82777 82856: gap of unknown length
* 82857 85682: contig of 2826 bp in length
* 85683 85762: gap of unknown length
* 85763 89308: contig of 3546 bp in length
* 89309 89388: gap of unknown length
* 89389 93999: contig of 4611 bp in length
* 94000 94079: gap of unknown length
* 94080 109466: contig of 15387 bp in length
* 109467 109546: gap of unknown length
* 109547 110183: contig of 637 bp in length
* 110184 110263: gap of unknown length
* 110264 110842: contig of 579 bp in length
* 110843 110922: gap of unknown length
* 110923 111636: contig of 714 bp in length
* 111637 111716: gap of unknown length
* 111717 112030: contig of 314 bp in length
* 112031 112110: gap of unknown length
* 112111 112674: contig of 564 bp in length
* 112675 112754: gap of unknown length
* 112755 113301: contig of 547 bp in length
* 113302 113381: gap of unknown length
* 113382 113979: contig of 598 bp in length
* 113980 114059: gap of unknown length
* 114060 114698: contig of 639 bp in length
* 114699 114778: gap of unknown length
* 114779 115187: contig of 409 bp in length
* 115188 115267: gap of unknown length
* 115268 115938: contig of 671 bp in length
* 115939 116018: gap of unknown length
* 116019 116539: contig of 521 bp in length
* 116540 116619: gap of unknown length

Query Match 5.3%; Score 162.6; DB 55; Length 161891;
Best Local Similarity 41.4%; Pred. No. 1.3e-09;
Matches 1027; Conservative 1; Mismatches 1425; Indels 26; Gaps



Qy 411 aagtttgattgtgtacatatatatatatatcttcaaattttataataaaaattgtgttta 470 

III II I II llll I II I I I II Mil I MM 

Db 137610 AAGGATGTTATATTAATTATACCGACATTAAATAATAATTACCMTATITTATTTGTTAT 137669 

Qy 471 aataatttacagttatattatttttttatctctaattttatttgtcgccaaatttttagt 530 

llll llll I I III I II I I II I I 

Db 137670 ATAATATAAATCAAAATATTITAACAACTATATAAATATAAATTATTTATATTTAATCAT 137729 

Qy 531 tgatattttaacataaaaaaaattgtacacatttacaagcccatatacaaataattatat 590 

I II I II I II I I I III I III llll II llll 

Db 137730 AIAMTAAAATAATTTATTTAACTTAATATATTATTTATATAATTTTAAAATTATAATAT 137789 

Qy 591 aaatattcattaaaaaatatatttaaatat---aggatataaatataactattttagaat 647 

II llll III II III llll llllllll II I II II 

Db 137790 AATTATTTTTTATTTTATTATATTACATATTAATITATATAAATTAAATITTTAAATATA 137849 
AUTHORS Hyman,R.W., Fung,E.L., Qin,F., Rowley,D., Tamaki,T., Kurdi,O.B.,
Conway,A.B. and Davis,R.W.
TITLE Plasmodium falciparum 3D7 chromosome 12
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 130281)
AUTHORS Hyman,R.W., Qin,F., Fung,E.L., Conway,A.B. and Davis,R.W.
TITLE Direct Submission
JOURNAL Submitted (19-FEB-1998) Stanford DNA Sequencing and Technology
Center, Stanford University, 855 California Avenue, Palo Alto, CA
94304, USA
COMMENT On Mar 15, 2000 this sequence version replaced gi:6652498.
* NOTE: This is a 'working draft' sequence. It currently
* consists of 3 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 67262: contig of 67262 bp in length
* 67263 67462: gap of unknown length
* 67463 82485: contig of 15023 bp in length
* 82486 82685: gap of unknown length
* 82686 130281: contig of 47596 bp in length.
FEATURES Location/Qualifiers
source 1..130281
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="12"
/clone="3D7"
BASE COUNT 52250 a 11780 c 11855 g 53996 t 400 others



Query Match 5.5%; Score 168; DB 60; Length 130281;

Best Local Similarity 44.2%; Pred. No. 3.8e-10;

Matches 1146; Conservative 0; Mismatches 1416; Indels 29; Gaps 10;

Qy 127 tcttgtgttacaatataataaatacatcgtagaaataaattttattcaaattgaagtctt 186 

I II II I II I I Mil I I I Mill II I I I I 

Db 98689 TATTTCGTAAAAAAAATAATAATAAAAATGATATTTAAATATTTATATTAACTACAATAT 98748 

Qy 187 aaccatctttaatatttgtagatgtaatttaaatgaaagataaatacatattcttggaca 246 

i ii ii ii i i i ii ii i i mini iii mi 

Db 98749 TAATATTTTATTATTTATAAATTATTTATTTAAAAATATATAAATATTAATTTATGGAAT 98808 

Qy 247 tgtattttcatcttaatgtttgtggctttggtgataggtgtattgatgtacgatgtcttt 306 

II II llll I II I I I I I llll I II II I III 

Db 98809 TTTTAAATTAATTTAAATTAATTGTTTATATTTAATTATATATTTA-ATAAAATATATTT 98867 

Qy 307 taaatcacatatcacattttgagtttgtatgatgataagtcgacataancgaaatatggt 366 

llll I llll llll I II I I III I III I 
Db 98868 TAAAATAATACACACAAAATGATTCTTAATTAAATATAAAATATTTATTTTATTTATAAT 98927 

Qy 367 gtgatcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtac 426 

I II I III II I I II III II III I I II 

Db 98928 GAAATAAATAATTTAATTTAAAATAAAATAAAATAAAAATAATAATATTTATTATTATAT 98987 

Qy 427 atatatatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttat 486 

I 1 1 i 1 1 1 II I II Mill II II I II II II I I I I II 
Db 98988 A-ATATATTTAATTAAATTAAATTTATTATTTAATTAATTTAAAATAAAATAAAAATAAT 99046 

Qy 487 attatttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataa 546 

I lllll II I I llll llll I llll I II II I I 
Db 99047 AATATTTATTATTATATAATATATTTAATTAAATTAAATTTATTATTTAATTAATTAATT 99106 

Qy 547 aaaaaattgtacacatttacaagcccatatacaaataattatataaatattcattaaaaa 606 

llll I I I I II II I I lllll I I llllll 
Db 99107 TAAAATAAAATAAAAATAATAATATTTATTATTATATAATATATTTAATTAAATTAAATT 99166 

Qy 607 atatatttaaatataggatataaatataactattttagaattattctactttaagataac 666 

Mill II II llll II II I lllll I II III II I ' 
Db 99167 TATTATTTAATTA----ATTTAAAATAAATAATAATTAAATTAATATATATTATAATTAA 99222



Qy 667 ataggttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgta 726 

II II II III I I II I III I I II III I I 
Db 99223 TTAAAAATAAAATAATATTCAATTATAAATTTATTAATAATTTTAAATAAATAATTAAAA 99282 

Qy 727 aacataatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatgg 786 

I I lllll III lllll I llll III llll 

Db 99283 TAATAATAAATTAAATAATTTAATTAA TATAAATAAACATTATAAATTAAAAAAT 99337 

Qy 787 aagggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttctt 846 

II II III I I I II III I II I I 

Db 99338 AATTTAATAAATATATATATAATTATTAATAAATTTAAATACATATATTTTTAATAATAA 99397 

Qy 847 ctttttaatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaata 906 

II lllll I III II llll llll I II I II I I I 
Db 99398 TTTCTTAATTTATTTTATTACATTATTATAATATATATTTTTTATTTAAAAAAATATAAT 99457 

Qy 907 taatgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataatag 966 

III III I III I I II lllll III I llll 

Db 99458 TAAATAAATTAATTTAATAATTAAATATATTTAATAGTAATTAAATATTAAACAAATAAT 99517 

Qy 967 ataaattaattgtggtacattagatcaaagaacaaactagattttgtcccattctattgt 1026 
llllll I I II I II III I II llll I III 

Db 99518 ATAAATATTATATAATATTATTAATTAAATTAAAATAATAATTTATTTTATAATTATATA 99577 

Qy 1027 taaaagctggtccgtttacattaaaataaggtacatgttacatgccacgtataactatct 1086 

II I I I I II llll II II I 
Db 99578 TATATATTAATAATTAATTTAAAATATAAATTAAGAAAGATAATTTTATACTTTTATTTA 99637 

Qy 1087 ggttattctatcaatcacgctaatttttaacagtagaaatgaatgtaatttttaaataga 1146 
I I I III lllllll II III llll I I I I ,•• 

Db 99638 ATTAAATATATAGTAATAAATAATTTTATGTTATTTATTATAATAATATTTATTATTTTA 99697 

Qy 1147 aagggtcaaattgttatttgatctaacacgtagggattaatttacttattttcctaaaga 1206 

I I llll I II I II II II III II III 
Db 99698 TTTTATTTATTTAATAAATAATTAATTTTATAAAATATATTTATTTTAAATTAAAATATA 99757 

Qy 1207 aataagtaaaatataatttgaatcttaatacaaaaactttcatgatacttttatcatatt 1266 

II I III II I II III I III I III II I II lllll -r 

Db 99758 AACATATAATTAATTAATTAAATATATATATATTTTHTTAATATAATAAATAATATATT 99817 

Qy 1267 ttacttataatttaatattgtgagagtaacaaartta aaaaacatagaaacacc 1320 

llllll III! II llllll II III I I I 
Db 99818 TCACATTTTAATTAAAATAAAAATAACCATTTATTAATTAACTTAATTAATATATAAAAT 99877 

Qy 1321 aaaagttagttatggtgtgactcatatacacagttaaaatttgaataaatttttttcttc 1380 

llll I II llll I I I II I I II llll I I II II I 
Db 99878 AAAATAATTTAATTGTGTAATTAAATTAAATATAAAACATTTATTAATAATTAATTATAT 99937 

Qy 1381 gtcattaattccatcatgggtttttttttttctagttaagccataattatcaaaataatc 1440 

III I II II II II I II I II I II II I I 

Db 99938 ATAATATTATATATTATCTTAAATTAATTAATTTTTTTAATTATTTTAATATAATAATTA 99997 

Qy 1441 atcattaatcctatcaataccccgccctgcctccctccctcaatacttaaacccaactaa 1500 

I llllll II II II I I 

Db 99998 TTTATTAATATTAATTATGTATATTTTATTTAATTGTTTAATATTTATTATTATTTTATT 100057 

Qy 1501 cacccagcaccaaacgcactttaatagccacctatttctagccatgtccttgcacttaaa 1560 

I II I II I II III I II II II III 

Db 100058 TAATATATTATTAATTTAATTAATTATTATATTATATTTAATTATTATATTATTTATAAT 100117 

Qy 1561 gaaaagtaaagctaacctgcaatcattccatatcgaggcctcaacagataaagttggttg 1620 

I I II III I III III I I II I II 

Db 100118 ATATTATTAATTTAATTATTGTTTATTTATTATATTATATATTATTATTTAA-TTATTA 100175 

Qy 1621 atgggtttgcaccaagttgttaaaacccggccctcaacttcccttttcttttcatcctcc 1680 

I III llll II I II I I II 

Db 100176 TTTTATTTATTATATTATATATTATTAATTTATATAATATAATTTAATTATATATATATT 100235 

Qy 1681 ccactccacaccctccaattttcttcatatggttctattataagttctttataatcacag 1740 

II I I III I llll II I I II II II 

Db 100236 TAATTTATTTATATATTAATTTAATTATATATTTATTTAATTTAATTATATATTTTATTT 100295 
Qy 2054 cctacaaacatgtcattacaatgtttaattataaattccattcttctattttactaagat 2113

I I! Ill III I I Mil I III II I I 
Db 7745 ATTTATAATTAATTTATAATTAATTTTAATTTAAAATATATTTTTAT T 7698 

Qy 2114 attagtaacttcaaactgctgatttttactaatttattatttataaattgttagaatgat 2173 

mi iii ii ii mm minim n in i n n 

Db 7697 ATTATTAAATTAAATGGTAATAAATGAAATTATTTATTATTAATTTATTTTAAGTTAAAT 7638 
Qy 2174 tatttttcaataatttaacaacaatatttaatattattattattattatttctcaatttt 2233 

iii 1 1 i ii ii i nun ii i ii i ii ii mi 

Db 7637 TATATATGTAATATATATATAATTTATTTATTAAATTAATAAACATATATATTTTCTTTT 7578 

Qy 2234 tattaaacaaaaacataaatttttgacaaattaaaataaatgaattaatttctcaatttt 2293 

I I II III III II II I I I II II I I I I III 
Db 7577 TTATTAAATTAAATATATATATTATATATTAAATAAATTATAATATTAATATATAAAATA 7518 

Qy 2294 tcgtgcaactattacaaaaatccttcatagtcctaatcttaatttgatgcagaggtgata 2353 

I I II I I I II I I I II llll I II I 
Db 7517 ACAGTGTAATAATTAATATATTATAATATATTTTTATATTAAAATAATAATAAATAATAT 7458 

Qy 2354 ataatcttaatttgatgcagaggtaataatgggccgggtttgagctggacttaagcatga 2413 

III II I I II Mil II I II I I 

Db 7457 TATATATATATATATTATATAATTAATTATTAAATATTAATTTTTGTTTTATATTAAATA 7398 

Qy 2414 tattgacgtactttatatttttccaaattcaacccagctcgaaatatgagtctaaaattt 2473 

llll I II llll III I I I I I III I II II I 
Db 7397 TATTATTATTTTTAATATATTTAATATATTTAATTAATTACCTATAATAAATTATTATAT 7338 

Qy 2474 tgtccaatttaatccaagcccattttaagttcgtccatattattttttaatttaaaaaat 2533 

m i iiiii i iiii i ii iii i mm i i 

Db 7337 TGTTTATATTAATTAATTTAATAATAAATATAAAATAAATAATTATATAATTTTATTTAA 7278 

Qy 2534 ttatatcattttattttaatatttaatt---attttatatattttttatttattgaaaat 2590

llll I I IIIII III I IIIII lllll IIIII I I 
Db 7277 GTATAATTAAATTMTAAATATATAAATAAAAATTAATTAATATATAAATTATTTTATAC 7218 

Qy 2591 ttttatatagtcatcttaacattatgttaatgtttatattagagtagtattatatatatt 2650 

I llll I I llll I IIIII MINI I III II I 
Db 7217 ATAATTATAATTAATTTAATTATTATTTAATTAATATATTATATATTTATGCATATTAAT 7158 

Qy 2651 tagtataggtttattttgttaataaacttaaaaatgggtcttgtgggctagacttggacc 2710

I llll I llll II llll III I llll I llll I 
Db 7157 TTATATATAACTTTTTTTATATTAAATAAAAATAATTTTCTTTT ACTTCTAAT 7105 

Qy 2711 ttaaatgctcaaactcaaacttaattcatattttaaacaggcttaatatttttatttaca 2770 

IIIII I IIIII II I II I II llll Mill! I 

Db 7104 ATAAATATATGTTATTTTCCTTAAGATATTTATTTTCAATAAATATIATTATTATTTAAA 7045 

Qy 2771 ctgtttcaaatttttcgggtgaaa 2794 

I II I I I llll 
Db 7044 TAAATATAATTATATAATATGATA 7021 



RESULT 12 
AC005504/C
LOCUS AC005504 104992 bp DNA HTG 01-APR-1999
DEFINITION Plasmodium falciparum chromosome 12, *** SEQUENCING IN PROGRESS
***, 3 unordered pieces.
ACCESSION AC005504
VERSION AC005504.3 GI:455B584
KEYWORDS HTG; HTGS_PHASE1.
SOURCE malaria parasite P. falciparum.
ORGANISM Plasmodium falciparum
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium.
REFERENCE 1 (bases 1 to 104992)
AUTHORS Hyman,R.W., Fung,E.L., Qin,F., Tamaki,T., Kurdi,O.B., Conway,A.B.
and Davis,R.W.
TITLE Plasmodium falciparum 3D7 chromosome 12
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 104992)
AUTHORS Hyman,R.W., Qin,F., Fung,E.L., Conway,A.B. and Davis,R.W.
TITLE Direct Submission
JOURNAL Submitted (21-AUG-1998) Stanford DNA Sequencing and Technology
Center, Stanford University, 855 California Avenue, Palo Alto, CA
94304, USA
COMMENT On Apr 2, 1999 this sequence version replaced gi:4337172.
* NOTE: This is a 'working draft' sequence. It currently
* consists of 3 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 58642: contig of 58642 bp in length
* 58643 58842: gap of unknown length
* 58843 91011: contig of 32169 bp in length
* 91012 91211: gap of unknown length
* 91212 104992: contig of 13781 bp in length.
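The contig/gap layout listed in the COMMENT block above can be checked mechanically. The following is a minimal sketch, not part of the search report: it assumes the rows follow the `* start end: contig|gap ...` pattern shown in this record, and the names `Segment`, `parse_layout` and `is_contiguous` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    kind: str   # "contig" or "gap"

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Layout rows copied from the AC005504 record above.
LAYOUT = """\
* 1 58642: contig of 58642 bp in length
* 58643 58842: gap of unknown length
* 58843 91011: contig of 32169 bp in length
* 91012 91211: gap of unknown length
* 91212 104992: contig of 13781 bp in length.
"""

# Assumed row shape: "* <start> <end>: contig|gap ..."
ROW = re.compile(r"\*\s*(\d+)\s+(\d+):\s*(contig|gap)\b")


def parse_layout(text: str) -> list[Segment]:
    """Extract (start, end, kind) segments from layout rows."""
    segments = []
    for line in text.splitlines():
        m = ROW.search(line)
        if m:
            segments.append(Segment(int(m.group(1)), int(m.group(2)), m.group(3)))
    return segments


def is_contiguous(segments: list[Segment]) -> bool:
    """True if every segment starts right after the previous one ends."""
    return all(b.start == a.end + 1 for a, b in zip(segments, segments[1:]))


segments = parse_layout(LAYOUT)
contig_total = sum(s.length for s in segments if s.kind == "contig")
placeholder_total = sum(s.length for s in segments if s.kind == "gap")

for s in segments:
    print(f"{s.kind:6s} {s.start:>6d}..{s.end:<6d} span {s.length}")
print("contig bases:", contig_total, "placeholder N bases:", placeholder_total)
```

The spans of the "gap" rows are only the lengths of the N placeholders in the record; as the NOTE states, the true gap sizes are unknown.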
FEATURES Location/Qualifiers
source 1..104992
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="12"
BASE COUNT 44286 a 9326 c 9564 g 41411 t 405 others
ORIGIN



Query Match 5.5%; Score 168; DB 41; Length 104992;

Best Local Similarity 44.2%; Pred. No. 4e-10;

Matches 1146; Conservative 0; Mismatches 1416; Indels 29; Gaps 10;
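
The "Best Local Similarity" figure above appears to be the share of alignment columns that are exact matches. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the printed value; `local_similarity` is a hypothetical helper, not a function of the search software.

```python
def local_similarity(matches: int, conservative: int,
                     mismatches: int, indels: int) -> float:
    """Percent of alignment columns that are exact matches.

    Assumes every alignment column is counted exactly once as a match,
    a conservative substitution, a mismatch, or an indel.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts taken from the statistics line above.
pct = local_similarity(matches=1146, conservative=0, mismatches=1416, indels=29)
print(f"{pct:.1f}%")  # 1146 / 2591 columns
```

With these counts the alignment has 1146 + 0 + 1416 + 29 = 2591 columns, and 1146 / 2591 rounds to the 44.2% shown in the report.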

Qy 127 tcttgtgttacaatataataaatacatcgtagaaataaattttattcaaattgaagtctt 186 
III II I II I I II! I I I IIIII II I I I I ' 
Db 74983 TATTTCGTAAAAAAAATAATAATAAAAATGATATTTAAATATTTATATTAACTACAATAT 74924 

Qy 187 aaccatctttaatatttgtagatgtaatttaaatgaaagataaatacatattcttggaca 246 

I II II II I I I II II I I lllllll III llll - 

Db 74923 TAATATTTTATTATTTATAAATTATTTATTTAAAAATATATAAATATTAATTTATGGAAT 74864 

Qy 247 tgtattttcatcttaatgtttgtggctttggtgataggtgtattgatgtacgatgtcttt 306 
II I I llll I II I I I I I llll I II II I III V 
Db 74863 TTTTAAATTAATTTAAATTAATTGTTTATATTTAATTATATATTTA-ATAAAATATATTT 74805 

Qy 307 taaatcacatatcacattttgagtttgtatgatgataagtcgacataancgaaatatggt 366 

llll I llll III I I II I I I II I III I 

Db 74804 TAAAATAATACACACAAAATGATTCTTAATTAAATATAAAATATTTATTTTATTTATAAT 74745 

Qy 367 gtgatcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtac 426 

I II I III llll II III II III I I II 

Db 74744 GAAATAAATAATTTAATTTAAAATAAAATAAAATAAAAATAATAATATTTATTATTATAT 74685 

Qy 427 atatatatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttat 486 

I mill II I II IIIII II III II II II I I I I II 
Db 74684 A-ATATATTTAATTAAATTAAATTTATTATTTAATTAATTTAAAATAAAATAAAAATAAT 74626 

Qy 487 attatttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataa 546 

I IIIII II I I llll I II I I llll I II II I I 

Db 74625 AATATTTATTATTATATAATATATTTAATTAAATTAAATTTATTATTTAATTAATTAATT 74566 

Qy 547 aaaaaattgtacacatttacaagcccatatacaaataattatataaatattcattaaaaa 606 

llll I I I Ml II I I IIIII I I mm 
Db 74565 TAAAATAAAATAAAAATAATAATATTTATTATTATATAATATATTTAATTAAATTAAATT 74506 

Qy 607 atatatttaaatataggatataaatataactattttagaattattctactttaagataac 666 

mini ii ii mi ii ii i iiiii i ii iii ii i 

Db 74505 TATTATTTAATTA----ATTTAAAATAAATAATAATTAAATTAATATATATTATAATTAA 74450

Qy 667 ataggttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgta 726 

II II II III I I III III I I II III I I 

Db 74449 TTAAAAATAAAATAATATTCAATTATAAATTTATTAATAATTTTAAATAAATAATTAAAA 74390 

Qy 727 aacataatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatgg 786 

I I I III I III IIIII I llll III llll 

Db 74389 TAATAATAAATTAAATAATTTAATTAA TATAAATAAACATTATAAATTAAAAAAT 74335 
I III! I I I I III I II II I I

Db 17600 TTAAAAAATAAACAAAAAATTTTTAATAAATAAATTTTATAATGAAATATAATTTATTTA 17659 

Qy 1510 ccaaacgcactttaatagccacctatttctagccatgtccttgcacttaaagaaaagtaa 1569 

I II III I I II I II II Mill MM I 
Db 17660 TTTTTCATTTTTAAAAAAAAATTTTTTAAAAAAAATAATTTTTTTTTTAAAAAAAMCTA 17719 

Qy 1570 agctaacctgcaatcattccatatcgaggcctcaacagataaagttggttgatgggtttg 1629 

I I I III I I I I II II I II II 
Db 17720 TATACTAATTATAAATTAATAGATATTTATATATATATAAATATTTAATATATTATTATA 17779 

Qy 1630 caccaagttgttaaaacccggccctcaacttcccttttcttttcatcctccccactccac 1689 

I I I I II III I I I I II II 

Db 17780 TATCTAATAATTTAAATAAAAAATTTTAAAATTTAAAAATGTAGATATAATTTATAAAAA 17839 

Qy 1690 accctccaattttcttcatatggttctattataagttctttataatcacagaatcaagat 1749 

I I I II III I II I Ml II IMI! I I III 
Db 17840 TTTATATTCTCATATTTATTTATTATTAATTTAA--TTTATATAAATAATATAATAATTT 17897 

Qy 1750 aagtcctcagcaaacaaaaaaccatggctctcgagcaagatctggactagtcagagctct 1809 

II I I I I I I I II I I II I I MM 

Db 17898 AATTAATTATTATATATTTATAAATTTATATATTATTGAATATTTATATAATATATATAT 17957 

Qy 1810 gaatattggatcattattacagtcaaaaacagttaacaaaagctgttgcagataaacact 1869 

Ml I I Ml MINI II Ml I II I II 
Db 17958 ATATATAGAAAAATTAAATTATTTAAATAATTTAATATAAATTTTTTAAAAATTTCTTAA 18017 

Qy 1870 gaatctgct atagtttgtttttggtttacatatgttccacgtgaaactatga 1921 

III II II I I I I I Mil III I I 
Db 18018 ATGTATTATTTTTATAAAAAATATTTATATAATAAAATCATGTTTTTTAAAAAATAAACA 18077 

Qy 1922 agcatctctaagaaaacccaaactatcatatcaacccatcgatcaatgaatcgattt--- 1978

I I II I III I III II I II II II II Ml 
Db 18078 AAAAATTTTTAATAAATAAATTTTATAATGAAATATAATTTATTTATTTTTCAATTTTTT 18137 

Qy 1979 -caattttcgcagtataagttccttttaatcctttctttttacttcattttataacgaa 2036 

II II III I Ml I I III II II II 

Db 18138 TAAAAAATTTTTTAAAAAAAATAATTTTTTTTTTAAAAAAAAACTATATACTAATTATAA 18197 

Qy 2037 ttctatggataatgttccctacaaacatgtcattacaatgtttaattataaattccattc 2096 

I II Ml I I II I I II I II Ml III I I II 
Db 18198 ATTAATAGATATTTATATATATATAAATATTTAATATATTATTATATATCTAATAATTTA 18257 

Qy 2097 ttctattttactaagatattagtaacttcaaactgctgatttttactaatttattattta 2156 

I I I I III II I I I I I II II II 
Db 18258 AATAAAAAATTTTAAAATTTAAAAATGTAGATATAATTTATAAAAATTTATATTCTCATA 18317 

Qy 2157 taaattgttagaatgattatttttcaataatttaacaacaatatttaatattattattat 2216 

I III I III III I III III II I Mllll 

Db 18318 TTTATTTATTATTAATTTAATTTATATAAATAATATAATGATTTAATTAATTATTATATA 18377 

Qy 2217 tattatttctcaatttttattaaacaaaaacataaatttttgacaaattaaaataaatga 2276 

II Ml Mil II I I Ml II I III II Ml I 

Db 18378 TTTATAAATTTATATATTATTGAATATTTATATAATATATATATATATATAGAAAAATTA 18437 

Qy 2277 attaatttctcaatttttcgtgcaactattacaaaaatccttcatagtcctaatcttaat 2336 

I I Ml I III II I II Ml I III I II II I III 

Db 18438 AATT ATTT AAATAATTTAATATAAATTTTTTAAAAATTTCTT AAATGTATT ATTTTTA - ■ 18495 

Qy 2337 ttgatgcagaggtgataataatcttaatttgatgcagaggtaataatgggccgggtttga 2396 

I I Mllll I I II I I I III I II 

Db 18496 TAAAAAATATTTATATAATAAAATCATGTTTTTTAAAAAATAAACAAAAAATTTTTAATA 18555 

Qy 2397 gctggacttaagcatgatattgacgtactttatatttttccaaattcaacccagctcgaa 2456 

I I II I MM II I I MM Ml II II II I 

Db 18556 AATAAATTTTATAATGAAATATAATTTATTTATTTTTCAATTTTTTTAAAAAATTTTTTA 18615 

Qy 2457 atatgagtctaaaattttgtccaatttaatccaagcccattttaagttcgtccatattat 2516 

I I I I Ml Mill I I I II I I I I ' 

Db 18616 AAMAAATAATTTTTTTTTTAAAAAAAAACTATATACTAATTATAAATTAATAGATATTT 18675 

Qy 2517 tttttaatttaaaaaatttatatcattttattttaatatttaattattttatatattttt 2576 

I I II Ml I II III! I Ml I II I I III I I I MM 



Db 18676 ATATATATATAAATATTTAATATATTATTATATATCTAATAATTTAAATAAAAAATTTTA 18735 

Qy 2577 tatttattgaaaatttttatatagtcatcttaacattatgttaatgtttatattagagta 2636 
Ml II I MM I I III I II MM I I II 

Db 18736 AAATTTAAAAATGTAGATATAATTTATAAAAATTTATATTCTCATATTTATTTATTATTA 18795 

Qy 2637 gtattatatatatttagtataggtttattttgttaataaacttaaaaatgggtcttgtgg 2696

I I II Mil I III I Ml MM I III I II I I 

Db 18796 ATTTAATTTATATAAATAATATAATGATTTAATTAATTA--TTATATATTTATAAATTTA 18853 

Qy 2697 gctagacttggaccttaaatgctcaaactcaaacttaattcatattttaaacaggcttaa 2756 

II III I II I II I I II I Ml II I I 

Db 18854 TATATTATTGAATATTTATATAATATATATATATATATAGAAAAATTAAATTATTTAAAT 18913 

Qy 2757 tatttttatttacactgtttcaaatttttcgggtgaaatatcttcgagtctagattaata 2816

Ml II III MM I II I III II I I 

Db 18914 AATTTAATATAAATTTTTTAAAAATTTCTTAAATGTATTATTTTTATAAAAMTATTTAT 18973 

Qy 2817 acaccacaggtctaatttgatgctcaatgaaaatgaaatcatattgagcttaattaatat 2876 

I I I I I III I I I II II I II I I I II III 
Db 18974 ATAATAAAATCATTTTTTTTTAAAAAIAAACAAAAAATTTTTAATAAATAAATTTTATAA 19033 

Qy 2877 tccattcttctttgctga 2894 

I I I I III I I 
Db 19034 TGAAATATAATTTATTTA 19051 
AE001398/C
LOCUS AE001398 14867 bp DNA INV 06-NOV-1998
DEFINITION Plasmodium falciparum chromosome 2, section 35 of 73 of the
complete sequence.
ACCESSION AE001398 AE001362
VERSION AE001398.1 GI:3845197
SOURCE malaria parasite P. falciparum.
ORGANISM Plasmodium falciparum
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium.
REFERENCE 1 (bases 1 to 14867)
AUTHORS Gardner,M.J., Tettelin,H., Carucci,D.J., Cummings,L.M., Aravind,L.,
Koonin,E.V., Shallom,S., Mason,T., Yu,K., Fujii,C., Pederson,J.,
Shen,K., Jing,J., Aston,C., Lai,Z., Schwartz,D.C., Pertea,M.,
Salzberg,S., Zhou,L., Sutton,G.G., Clayton,R., White,O.,
Smith,H.O., Fraser,C.M., Adams,M.D., Venter,J.C. and Hoffman,S.L.
TITLE Chromosome 2 sequence of the human malaria parasite Plasmodium
falciparum
JOURNAL Science 282 (5391), 1126-1132 (1998)
MEDLINE 99021743
REMARK Erratum:[[published erratum appears in Science 1998 Dec
4;282(5395):1827]]
REFERENCE 2 (bases 1 to 14867)
AUTHORS Gardner,M.J.
TITLE Direct Submission
JOURNAL Submitted (02-NOV-1998) The Institute for Genomic Research, 9712
Medical Center Drive, Rockville, MD 20814, USA
FEATURES Location/Qualifiers
source 1..14867
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="2"
gene complement(1570..2424)
/gene="PFB0490c"
CDS complement(1570..2424)
/gene="PFB0490c"
/note="predicted by GlimmerM"
/codon_start=1
/product="hypothetical protein"
/protein_id="AAC71887.1"
/db_xref="GI:3845198"
/translation="MKEKNEKIMDYLSCPLDDVVDREKKSGKNSLLKSSSTKKSDYKK
SSIFSKKRDSHKKGSSFRGRRSGFINRKSGSFKKPYYNNRLINKNYNNYKGRNFHNGR
DNFKGRTGSFGSRVFDNRKGSFKKRFISNRNKSSVKSYRGNGSNKMGRKSFNKAPTSR
TWTKRLNNYKTVSAPVKKFNNLNISLYRKNRTFALNTKRSKPVGTIKSSVPRKRIKK
REFERENCE 1 (bases 12511 to 12682)
AUTHORS Clary,D.O., Goddard,J.M., Martin,S.C., Fauron,C.M. and
Wolstenholme,D.R.
TITLE Drosophila mitochondrial DNA: a novel gene order
JOURNAL Nucleic Acids Res. 10 (21), 6619-6637 (1982)
MEDLINE 83090428
REFERENCE 2 (bases 5269 to 5695)
AUTHORS Clary,D.O., Wahleithner,J.A. and Wolstenholme,D.R.
TITLE Transfer RNA genes in Drosophila mitochondrial DNA: related 5'
flanking sequences and comparisons to mammalian mitochondrial tRNA
genes
JOURNAL Nucleic Acids Res. 11 (8), 2411-2425 (1983)
MEDLINE 83220794
REFERENCE 3 (bases 404 to 5272)
AUTHORS de Bruijn,M.H.
TITLE Drosophila melanogaster mitochondrial DNA, a novel organization and
genetic code
JOURNAL Nature 304 (5923), 234-241 (1983)
MEDLINE 83245048
REFERENCE 4 (bases 804 to 1778)
AUTHORS Satta,Y., Ishiwa,H. and Chigusa,S.I.
TITLE Analysis of nucleotide substitutions of mitochondrial DNAs in
Drosophila melanogaster and its sibling species
JOURNAL Mol. Biol. Evol. 4 (6), 638-650 (1987)
MEDLINE 88174373
REFERENCE 5 (bases 5268 to 13619)
AUTHORS Garesse,R.
TITLE Drosophila melanogaster mitochondrial DNA: gene organization and
evolutionary considerations
JOURNAL Genetics 118 (4), 649-663 (1988)
MEDLINE 88212147
REFERENCE 6 (bases 441 to 2967)
AUTHORS Satta,Y. and Takahata,N.
TITLE Evolution of Drosophila mitochondrial DNA and the history of the
melanogaster subgroup
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 87 (24), 9558-9562 (1990)
MEDLINE 91088557
REFERENCE 7 (bases 14215 to 14512)
AUTHORS Ballard,J.W., Olsen,G.J., Faith,D.P., Odgers,W.A., Rowell,D.M. and
Atkinson,P.W.
TITLE Evidence from 12S ribosomal RNA sequences that onychophorans are
modified arthropods
JOURNAL Science 258 (5086), 1345-1348 (1992)
MEDLINE 93088057
REFERENCE 8 (bases 14917 to 19517)
AUTHORS Lewis,D.L., Farr,C.L., Farquhar,A.L. and Kaguni,L.S.
TITLE Sequence, organization, and evolution of the A+T region of
Drosophila melanogaster mitochondrial DNA
JOURNAL Mol. Biol. Evol. 11 (3), 523-538 (1994)
MEDLINE 94285822
REFERENCE 9 (bases 1 to 408; 13319 to 19517)
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S.
TITLE Drosophila melanogaster mitochondrial DNA: completion of the
nucleotide sequence and evolutionary comparisons
JOURNAL Insect Mol. Biol. 4 (4), 263-278 (1995)
MEDLINE 96423163
REFERENCE 10 (bases 1 to 19517)
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S.
TITLE Direct Submission
JOURNAL Submitted (03-OCT-1995) Laurie S. Kaguni, Biochemistry Department,
Michigan State University, East Lansing, MI 48824-1319, USA
FEATURES Location/Qualifiers
source 1..19517
/organism="Drosophila melanogaster"
/organelle="mitochondrion"
/db_xref="taxon:7227"
/note="derived from new and previously submitted
sequences; sequence is a composite containing sequences
obtained from different Drosophila melanogaster strains"
tRNA 1..65
/gene="mt:ND6"
/product="tRNA-Ile"
/db_xref="FlyBase:FBgn0013685"
gene 1..19517
/gene="mt:ND6"
/note="mitochondrial NADH-ubiquinone oxidoreductase chain
6"
/allele=""
/db_xref="FlyBase:FBgn0013685"
tRNA complement(97..165)
/product="tRNA-Gln"
tRNA 171..239
/gene="mt:ND6"
/product="tRNA-Phe"
/db_xref="FlyBase:FBgn0013685"
CDS 240..1265
/gene="mt:ND6"
/codon_start=1
/db_xref="FlyBase:FBgn0013685"
/transl_table=5
/product="nadh dehydrogenase subunit 2"
/protein_id="AAC47811.1"
/db_xref="GI:1166530"
/translation="MFNNSSKILFITIMIIGTLITVTSNSWLGAWMGLEINLLSFIPL
LSDNNNLMSTEASLKYFLTQVLASTVLLFSSILLMLKNNMNNEINESFTSMIIMSALL 
LKSGAAPFHFWFPNMMEGLTWMNALMLMTWQKIAPLMLISYLNIKYLLLISVILSVII 
GAIGGLNOTSLRKLMAFSSINHLGWMLSSLMISESIWLILFFFYSFLSFVLTFMFNIF 
KLFHLNQLFSWFVNSKILKFTLFMNFLSLGGLPPFLGFLPKWLVIQQLTLCNQYFMLT 
IMMMSTLITLFFYLRICYSAFMMNYFENNWIMKMNMNSINYNMYMIMTFFSIFGLFLI 
SLFYFMF" 

tRNA 1264..1329
/gene="mt:ND6"
/product="tRNA-Trp"
/db_xref="FlyBase:FBgn0013685"
tRNA complement(1322..1383)
/product="tRNA-Cys"
tRNA complement(1403..1468)
/product="tRNA-Tyr"
CDS join(1470..1472,1474..3009)
/codon_start=1
/exception="mechanism underlying reading frame shift after
first codon uncertain"
/transl_table=5
/product="cytochrome c oxidase subunit I"
/protein_id="AAC47812.2"
/db_xref="GI:7412849"
/translation="MSRQMLFSTNHKDIGTLYFIFGAWAGMVGTSLSILIRAELGHPG
ALIGDDQIYNVIVTAHAFIMIFFMVMPIMIGGFGNWLVPLMLGAPDMAFPRMNNMSFW 
LLPPALSLLLVSSMVENGAGTGWTVYPPLSAGIAHGGASVDLAIFSLHLAGISSILGA 
VNFITTVINMRSTGISLDRMPLFVWSWITALLLLLSLPVLAGAITMLLTDRNLNTSF 
FDPAGGGDPILYQHLFWFFGHPEVYILILPGFGMISHIISQESGKKETFGSLGMIYAM 
LAIGLLGFIVWAHHMFTVGMDVDTRAYFTSATMIIAVPTGIRIFSWLATLHGTQLSYS 
PAILWALGFVFLFTVGGLTGWLANSSVDIILHDTYYWAHFHYVLSMGAVFAIMAGF 
IHWYPLFTGLTLNNKWLKSHFIIMFIGVNLTFFPQHFLGLAGMPRRYSDYPDAYTTWN 
IVSTIGSTISLLGILFFFFIIWESLVSQRQVIYPIQLNSSIEWYQNTPPAEHSYSELP 
LLTN" 

tRNA 3012..3077
/gene="mt:ND6"
/product="tRNA-Leu"
/db_xref="FlyBase:FBgn0013685"
CDS 3083..3767
/note="TAA stop codon is completed by the addition of 3' A
residues to the mRNA"
/codon_start=1
/transl_except=(pos:3767,aa:TERM)
/transl_table=5
/product="cytochrome c oxidase subunit II"
/protein_id="AAC47813.1"
/db_xref="GI:1166532"
/translation="MSTWANLGLQDSASPLMEQLIFFHDHALLILVMITVLVGYLMFM
LFFNNYVNRFLLHGQLIEMIWTILPAIILLFIALPSLRLLYLLDEINEPSVTLRSIGH 
QWYWSYEYSDFNNIEFDSYMIPTNELMTDGFRLLDVDNRWLPMNSQIRILVTAADVI 
HSWTVPALGVKVDGTPGRLNQTNFFINRPGLFYGQCSEICGANHSFMPIVIESVPVNY 
FIRWISSNNS" 
tRNA 3768..3838
/gene="mt:ND6"
MEDLINE 94285822
REFERENCE 2 (bases 1 to 4601)
AUTHORS Kaguni,L.S.
TITLE Direct Submission
JOURNAL Submitted (28-JUN-1994) Laurie S. Kaguni Ph.D, Dept. of
Biochemistry, Michigan State University, East Lansing, MI,
48824-1318, USA
FEATURES Location/Qualifiers
source 1..4601
/organism="Drosophila melanogaster"
/organelle="mitochondrion"
/strain="Oregon-R"
/db_xref="taxon:7227"
/dev_stage="embryo"
gene 1..4601
/gene="mt:ori"
/note="mitochondrial origin"
/allele=""
/db_xref="FlyBase:FBgn0013687"
repeat_unit 650..1022
/gene="mt:ori"
/note="repeat I-A"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 1023..1360
/gene="mt:ori"
/note="repeat I-B1"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 1361..1705
/gene="mt:ori"
/note="repeat I-C/A"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 1706..2043
/gene="mt:ori"
/note="repeat I-B2"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 2044..2388
/gene="mt:ori"
/note="repeat I-C"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
misc_feature 2491..2511
/gene="mt:ori"
/note="deoxythymidylate stretch"
/db_xref="FlyBase:FBgn0013687"
repeat_unit 2512..2648
/partial
/gene="mt:ori"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 2649..3112
/gene="mt:ori"
/note="repeat II-A"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 3113..3576
/gene="mt:ori"
/note="repeat II-B1"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 3577..4040
/gene="mt:ori"
/note="repeat II-B2"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 4041..4504
/gene="mt:ori"
/note="repeat II-C"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
misc_feature complement(4565..4585)
/note="deoxythymidylate stretch"
BASE COUNT 2271 a 131 c 74 g 2125 t
ORIGIN



Query Match 5.7%; Score 172.8; DB 33; Length 4601;

Best Local Similarity 43.7%; Pred. No. 3.3e-10; 

Matches 1266; Conservative 0; Mismatches 1594; Indels 38; Gaps 1 

Qy 14 aatttttaatagtaaancctaaccaatttttaataataaagctgactcctagtacaagag 73 

III llllll II I II I II I I II III I I II II 

Db 1259 AATAATTAATAAAAATATTTTTAATATAATAAAAATTTAAAATGATTTTTTATAAAA--- 1315

Qy 74 cttttattcattcttctattttgctttcctctaggcttggcaatcgagaattttcttgtg 133 

II llllll Mil I I I I II II I I II I 

Db 1316 ATTCAATTCATATATTTATATATATATACATATAATTTAATTTTCAATTAAATTATATAA 1375 

Qy 134 ttacaatataataaatacatcgtagaaataaattttattcaaattgaagtcttaaccatc 193 

II Mil lllll I I I I Mill I I I II I II I I 

Db 1376 GTATAATAAAATAATTTATTTTAATCACTAAATCTGAATTAATTAATTGTATATATATAT 1435 

Qy 194 tttaatatttgtagatgtaatttaaatgaaagataaatacatattcttggacatgtattt 253 

I INI I II I III llll II I lllll lllll 

Db 1436 ATATATATATATATGTAAAATGAAAATAAATTTATTCCCCCTATTCATAAATTTATTATA 1495

Qy 254 tcatcttaatgt----ttgtggctttggtgataggtgtattgatgtacgatgtcttttaa 309

I II II I I III I I I III I lllll 

Db 1496 TAATTAAAACTTAAAAAAATATTTTTTTTAAAAAAAAAATTATTTATTAAATTATACTTA 1555 

Qy 310 atcacatatcacattttgagtttgtatgatgataagtcgacataancgaaatatggtgtg 369 

II I III I I II I I II I I II III 

Db 1556 ATAAACTATTTTTATAATAAATTATTTTATAAATAAAATTATTTAAAATAATTAATAAGA 1615 

Qy 370 atcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacata 429 

I I lllll I I II lllll III I Mil III 

Db 1616 AATATTTTTATTATAATAAAAATTAAAAATAATTTTTAAAAAATTCAATTTATATTTATA 1675 

Qy 430 tatatatatatcttcaaattttataataaaaattgtgtttaaataatttacagttatatt 489

milium i mm i i mi n i m n m i - 

Db 1676 TATATATATATATATATAATTTTAATTTTCAATTAAATTATATAAATATAATAAAATAAT 1735 

Qy 490 atttttttatctctaattttatttgtcgccaaatttttagttgatattttaacataaaaa 549 

I llll III llll II llll llll I I I lllll 

Db 1736 TTATTTTAATCACTAAATCTGAAATAATTAATTATATATATATATATATATATAAAAAAA 1795 

Qy 550 aaattgtacacatttacaagcccatatacaaataattatataaatattcattaaaaaata 609 

I II I I I I I I II I III lllll II llllllllll 

Db 1796 TGAAAATAAATTTATTCCCCCTATTCATAAATTTATTGTATAATTAAAACTTAAAAAATA 1855 

Qy 610 tatttaaatataggatataaatataactattttagaattattctactttaagataacata 669 

I III I I I I I I I II III I III I III I II 

Db 1856 TTTTTTTTTTAAAAAAAAATGATTTATTAAATTATACTTAATAAACTATTTTTATAATAA 1915 

Qy 670 ggttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaac 729 

I I II I III I III llll II I I I I II I llll 

Db 1916 ATTATTTTATAAATAAAATTATTTTAAATAATTAATAAAAATATTTTTAATATAATAAAA 1975 

Qy 730 ataatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaag 789 

II I I I III II II I I I I I I I I I II II 

Db 1976 ATTTAAAATGATTTTTTATAAAAATTCAATTCATATATTTATATATATATAC-ATATAAT 2034 

Qy 790 ggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttcttctt 849 

II III I II I II Mil II I II I II II I III 

Db 2035 TTAATTTTCAATTAAATTATATAAGTATAATAAAATAATTTATTTTAATCACTAAATCTG 2094 

Qy 850 tttaatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaatataa 909 

I I III II I I I I I III I I llll III I 

Db 2095 AATTAATTAATTGTATATATATATATATATATATATATGTAAAATGAAAATAAATTTATT 2154 

Qy 910 tgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataatagata 969 

III III I I 'I I II II I I I I 

Db 2155 CCCCCTATTCATAAATTTATTATATAATTAAAACTTAAAAAAATATTTTTTTTTAAAAAA 2214 
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LSCCDYIYVLRKGEITYRCSYEDVKTQSELSHLLEMDD" 
rRNA 23896..31533
/gene="rRNA"
/note="region containing small subunit, 5.8S and large
subunit rRNA genes and spacer regions"
gene 23896..31533
/gene="rRNA"
gene complement(31966..32775)
/gene="MAL1P3.04"
CDS complement(join(31966..32476,32675..32775))
/gene="MAL1P3.04"
/note="MAL1P3.04, conserved hypothetical membrane protein,
len: 203 aa, similarity: P. falciparum chromosome 2,
PFB0110W, 096126 predicted integral membrane protein (255
aa), fasta scores: opt: 335, E(): 4.9e-15, (36.1% identity
in 191 aa overlap)"
/codon_start=1
/product="conserved hypothetical membrane protein,
MAL1P3.04"
/protein_id="CAB63559.1"
/db_xref="GI:6594247"
/translation="MKKSYTFINVTILI<FIiTLLLFLTHNYDTFSKTKPNNNIKIDIN
RFRRIIAEASEEQKYPWEEDFCLILNEEELIRPEHNDSPYLPEHYENIDRINELSINS 
TKIWKCTIKKMRQNYEKETDNMNHNWRDFMWHYKWANIYLYKVHKLINITLKDLTNPI 
HDREETITTWIRWIQEDIEYFLFNLQVEWLRILTtELFYKNRE" 
misc_feature complement(32477..32486)
/gene="MAL1P3.04"
/note="potential splice acceptor sequence"
misc_feature complement(32669..32674)
/gene="MAL1P3.04"
/note="potential splice donor sequence, aaa/gtatat"
gene 36657..37343
/gene="MAL1P3.05"
CDS join(36657..36743,36864..37343)
/gene="MAL1P3.05"
/note="MAL1P3.05, hypothetical protein, len: 188 aa"
/codon_start=1
/product="hypothetical protein, MAL1P3.05"
/protein_id="CAB63560.1"
/db_xref="GI:6594248"
/translation="MRIRMNSGIFFIRLLICISFICVFECFNKCMISYRRDLLWYSEN
CFNYSIDRSLAEGSSESKETKVKDIPNIELLKSLNINTEEYEKMKEIVGSPMDNNNLN 
IANEVLKNIHSFTNIENIFSLINDSSKSPVLRTFLKEFGSIFPHMLNNVPKLLFDLCQ 
RNPLHI ILGLIVILAAIYVFENFKNFEC " 
misc_feature 36744..36749
/gene="MAL1P3.05"
/note="potential splice donor sequence, aag/gtatga"
misc_feature 36854..36863
/gene="MAL1P3.05"
/note="potential splice acceptor sequence"
gene complement(38049..40284)
/gene="garp"
CDS complement(join(38049..39995,40210..40284))
/gene="garp"
/note="MAL1P3.06, garp, len: 673 aa, similarity: almost
identical to GARP_PLAFF (678 aa), fasta scores: 97.6%
identity in 678 aa overlap"
/codon_start=1
/product="hypothetical garp protein"
/protein_id="CAB63561.1"
/db_xref="GI:6594249"
/translation="MNVLFLSYNICILFFWCTLNFSTRCFSNGLLKNQNILNKSFDS
ITGRLLNETELERNRDDNSRSETLLKEERDERDDVPTTSNDNLRNAHNNNEISSSTDP 
TNIINVNDKDNENSVDKKKDKKEKKHKKDKKEKKEKKDKKEKKDKKEKKHKKEKKHKK 
DKRREENSEVMSLYRTGQHRPRNATEHGEENLYEEMVSEINNNAQGGLLLSSPYQYRE 
OGGCGIISSVHETSNDTRDNDREKISEDRREDHQQEEMLRTLDRRERKQKEREMKEQE 
RIEKRRRRQEERERRRQERERRRQEKRERRQREREMRRQRRIERERRRREERERKKRK 
HDKEKEETMQQPDQTSEETKN3IMVPLPSPLTDVTTPEEHKEGEHREEEHREGEHKEG 
EHREEEHREEEHRKEEHKSREHRSRGKRDRGRRDRGRHKRMKERVRRHVVRNVIEDE 
DKDGVE I INLEDKEACEEQH ITVESRPL SQPQCKLIDEPEQLTLMDKS KVEEKNLS IQ 
EQLIGTIGRVNWPRRDNHRRRMARIEEAELQKQRHVDREEDKREESREVEEESKEVQ 
EDEEEVEEDEEEEEEEEEEEEEEEEEEEEEEDEVEEDEDDAEEDEDDAEEDEDDAEED 
DDDAEEDDDDAEEDDDEDEDEDEEEEEDEEEEEESEKKIRRNLRRNARI" 



misc_feature complement(39996..40005)
/gene="garp"
/note="potential splice acceptor sequence"
misc_feature complement(40204..40209)
/gene="garp"
/note="potential splice donor sequence, aag/gtaaca"
gene 45401..50233
/gene="MAL1P3.07"



Query Match 5.8%; Score 175.8; DB 33; Length 67970;
Best Local Similarity 43.9%; Pred. No. 7e-11;
Matches 1220; Conservative 1; Mismatches 1539; Indels 22; Gaps 10;



Qy 139 atataataaatacatcgtagaaataaattttattcaaattgaagtcttaaccatctttaa 198 

III I I II I I III II lllll III I I i I 
Db 7824 ATAAATAATATTCCTTAATTATATTAACATTATTAAAAATATAATTAAGGAATAAATATT 7883 

Qy 199 tatttgtagatgtaatttaaatgaaagataaatacatattcttggacatgtattttcatc 258 

I II I I I II Mil I I INI I III III I II 
Db 7884 TTCTTAAAAAACGATGATATAATAAATAATTTAATATATATTAATTCATTTATATATATA 7943 

Qy 259 ttaatgtttgtggctttggtgataggtgtattgatgtacgatgtcttttaaatcacatat 318 

llll I I I I II I II II I II II I I III I III 

Db 7944 ATAATATATTTCACATTAATTTTAATAGTTTATATATA----ATTATATAATTTCTTTAT 7999

Qy 319 cacattttgagtttgtatgatgataagtcgacataancgaaatatggtgtgatcttcact 378 

I II I II III I II II II I I I I I 
Db 8000 TAATAATTAATTATAGTACACATTTATATTAATAAATATAATTAAAGAATATITTATC1T 8059 

Qy 379 tttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacatatatatatat 438 

III I III I IN II III I II III I III ••. 
Db 8060 TTGTATTATATATATATATATTAMTGGAATAAATTAAATAAATAAAATAATAAAAATAA 8119 

Qy 439 atcttcaaattttataataaaaattgtgtttaaataatttacagttatattattttttta 498 

I III I lllll II III I III I III II II I 

Db 8120 TATATTAAAATATATAAATTAACAAAATAATAATATAATTAAATAAATAAAATATTATAT 8179 

Qy 499 tctctaattttatttgtcgccaaatttttagttgatattttaacataaaaaaaattgtac 558 

I III I II II II I III llll M II I I I -' 

Db 8180 TAAATAAATAAAATAAAATAAAATATTAAAAAATATAAATTAATAATATAATTAATTAAA 8239

Qy 559 acatttacaagcccatatacaaataattatataaatattcattaaaaaatatatttaaat 618 

I II I lllll II I II II III I HIM II 
Db 8240 TAAAATATTAAAATATATAAATTAATAATATAATTAATAAATAAAATATTATATTAAATA 8299 

Qy 619 ataggatataaatataactattttagaattattctactttaagataacataggttaaatg 678 

I I llllll I II I lllll I I llll I I II 

Db 8300 AATTAAATAAAATATTAAAATATATAAATTAATAATATAAITAATAAATAAAATATTATA 8359 

Qy 679 tataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaacataatcact 738 

Mil I lllll I I I I I Ml II llll llll llll I I 

Db 8360 TATATTAAAIAAATTAAATAAAATATTAAAATATATAAATTAATAATATAATTAATAAAT 8419 

Qy 739 aaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaagggaaatttg 798 

II II llllll II I II I III I I II II II I 

Db 8420 AAAATATTATATTAAATAAATTAAATAAAATATTAAAATATATAAATTAA--TAATATAT 8477

Qy 799 agagtaagttcatgtttatattatacataatgaagttgatgttttcttctttttaatatt 858 

iiii ii mini III II I I I II I III 

Db 8478 ATAATTAAAIAAATAAACTATTATATTAAATAAATTAATAATATAAITAATAAATAAAAT 8537 
Qy 859 tttatacaaaat-atttaaataaaataattaaggattgaatgaaaaatataatgaaagtc 917 

inn iiii i illinium i n n n n i n n i 

Db 8538 ATTATATTAAATAAATTAAATAAAATATTAAAATATATAAATTAATAATATATATAATTA 8597 

Qy 918 gttttactaatagtcatattgcattttgtcgcatctacttaaataatagataaattaatt 977 

I I II I lllll II I I I lllll I I llll I 
Db 8598 AATAAATAAAATATTATATTAAATAAATTAATAATATAATTAATAAATAAAATATTATAT 8657 

Qy 978 gtggtacattagatcaaagaacaaactagattttgtcccattctattgttaaaagctggt 1037 

II llll II III I III II II I I I II I III I 

Db 8658 TAAATAAATTAAATAAAATATTAAAATATAIAAATTAATAATATAATTAATAAATAAAAT 8717 
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Db 41588 AATTTTTTTTTTTTAMAATTAATATATATATGTTAATATTTATCTAACCCATTAAATTT 41529 



Qy 182 gtcttaaccatctttaatatttgtagatgtaatttaaatgaaagataaatacatattctt 241

1 1 ii i ii mi i i mi ii i i inn i n n

Db 41528 TTTTAAAAATATTATATTATTAATTCTTTTAATATACTGTATAAATAAAATGAATTTTTT 41469

Qy 242 ggacatgtattttcatcttaatgtttgtggctttggt---gataggtgtattgatgtacg 298

II I II Ml I I III III I III II I
Db 41468 TTTCAATTTAAAATATTTATATATGTAATTGTATAATAAGAATAATTATATAAATTAAAT 41409

Qy 299 atgtcttttaaatcacatatcacattttgagtttgtatgatgataagtcgacataancga 358

II I llll II I I II I II II I II III I I I I llll I

Db 41408 ATATATTTTGTATAATAAATAATATATTAACTTAATATAAACAATATAACATATAATAAA 41349

Qy 359 aatatggtgtgatcttcacttttgaactttgataagtcaccaaactttaacaaagtttga 418

II I III II II I I I I I I III III I I I
Db 41348 TTAATTTAGCAATTTATTAATTAAAATTATATTTAATT-CAAAATATTATAATATAAATA 41290

Qy 419 ttgtgtacatatatatatatatcttcaaattttataataaaaattgtgtttaaataattt 478
I llll II III I II III I III llllll I! I I II I Ml
Db 41289 ATAAATACACATCAATA-AAATAATCATTTATTAAAATAAATATAATAATAAATTTAATT 41231

Qy 479 acagttatattatttttttatctctaattttatttgtcgccaaatttttagttgatattt 538

I llll III II II I I II I I! Ill I III II

Db 41230 TTA--IATAATATAATTAAATATTATAAAATACAATTAAAATAAATTTCATAAAATAATT 41173

Qy 539 taacataaaaaaaattgtacacatttacaagcccatatacaaataattatataaatattc 598

II III I I III I III I I II I II III I lllll I

Db 41172 AAATATACATATAATAAATTAAATTAAATATAATATTAATAATTAAGTGTATAATATATA 41113

Qy 599 attaaaaaatatatttaaatataggatat-aaatataactattttagaattattctactt 657

III M I II I III llllllll II II I III I I

Db 41112 TATAATAAATATAAAATATTAAATATTATAAAATATAATTAAAATAAATTTCATAAAATA 41053

Qy 658 taagataacataggttaaatgtataattaataaggttagtttattgtaaagatgagtata 717

I lllll III II llll II I Mil I llll
Db 41052 ATTAAATACATATAATAATAATAATTTTAAATAAATATAATATTAATAATTAACTATATA 40993

Qy 718 tatgtcgtaaacataatcactaaccatttttattaacttcttggttttgaagttccaaaa 777

II I llll I III III I llll I II II

Db 40992 ATATATATATAAATAAAAATTAATTATTATATAATTAGATTTAATTATATTATTAAATTA 40933

Qy 778 agaaaatggaagggaaatttgagagtaagttcatgtttatattatacataatgaagttga 837
I III II II II II llllll llll I II I

Db 40932 ATTAAAATAAAATAAATATTATATAATTAATTAAATTTATATATATTATAAATTAATTAA 40873



Qy 838 tgttttcttctttttaatatttttatacaaaatatttaaataaaataattaaggattgaa 897 

III I 1 1 II 1 1 llllll llllll II M 
Db 40872 AATATAAT AAATATTATAAATT AAT TAAAATAT AATAAATGTTATATAATTAAAT 40818 

Qy 898 tgaaaaatataatgaaagtcgttttactaatagtcatattgcattttgtcgcatctactt 957 

I II Mil II III I II I III I III I I II I II 

Db 40817 TTAATTATATTATTAAATTAATTAAAATAAATATAATAAATATATAATTATAATTAAATT 40758 

Qy 958 aaataatagataaattaattgtggtacattagatcaaagaacaaactagattttgtccca 1017 

III III I I I II I I II II II llll I 
Db 40757 TAATTATATTATTAAATTAATTAAAATATAATAAATATTATATAATTAAATTTAAATATA 40698 

Qy 1018 ttctattgttaaaagctggtccgtttacattaaaataaggtacatgttacatgccacgta 1077 

II I I llllll llll llll II I I 

Db 40697 TTATTTAATTAAAATAAAATAAATGT - -AATAAATATTATATAATTAAATTTAAATATAT 40640 

Qy 1078 taactatctggttattctatcaatcacgctaatttttaacagtagaaatgaatgtaattt 1137 

II II llllllll I III II I I III 
Db 40639 TATTTAATTAATTAAAATAAAATAAATATAATAAAATAATTATATATAATAATAATTAAA 40580 

Qy 1138 ttaaatagaaagggtcaaattgttatttgatctaacacgtagggattaatttacttattt 1197 

llllll II llll I III III I III II III I 
Db 40579 TTAAATTAAATTAAATAAATATTAATTAAAATAAAATACTTTAAATATAAAIAATTAAAT 40520 

Qy 1198 tcctaaagaaataagtaaaatataatttgaatcttaatacaaaaactttcatgatacttt 1257 

I II I III lllllll III I II I llll I I I 

Db 40519 TAAATAMTATTAATTAAAATAAAATACTTAATAAGTATATAATAATIAAATTAAAIATA 40460 



Qy 1258 tatcatattttacttataatttaatattgtgagagtaacaaarttaaaaaacatagaaac 1317 

I I II III III llll I llll Mil I II I 
Db 40459 TTTAATTAAAATTATATTATTAAATAATTATTTAATTTTAATATTAATATTAATTAATTA 40400 

Qy 1318 accaaaagttagttatggtgtgactcatatacacagttaaaatttgaataaatttttttc 1377 

I II I II I II I I I I I II I I III II I 

Db 40399 AAGTTAAAAAATATAAAATAATTAAATTAAATATATTAATAAATAATAAAATATATTATA 40340 

Qy 1378 ttcgtcattaattccatcatgggtttttttttttctagttaagccataattatcaaa-at 1436 

II llll lllll II I II I I III I II 

Db 40339 TTATAATATAATATATAATTATAIATATTAAAATAAAATAAAATTAATAATATATTATAT 40280 

Qy 1437 aatcatcattaatcctatcaataccccgccctgcctccctccctcaatacttaaacccaa 1496 

II II llll II llll lllll III 

Db 40279 TATAATAATTATATATAATAATAATTAATTAATTTTAATTAAATTAAATTAAAACAGTAT 40220 

Qy 1497 ctaacacccagcaccaaacgcactttaatagccacctatttctagccatgtccttgcact 1556 

III I I II III llll I I I II I 

Db 40219 TTAATAATACGTGTGTGTAATATATATATATTATTTAATTTATTTAATAITGIATTCAAI 40160 

Qy 1557 taaagaaaagtaaagctaacctgcaatcattccat-atcgaggcctcaacagataaagtt 1615 

I llll II I II I II I I II II I I I I III II 
Db 40159 AATIAAAAAATATAAAIATGTTTCATAAAATAAATAATTAAAAATACTAIATATATTATT 40100 

Qy 1616 ggttgatgggtttgcaccaagttgttaaaacccggccctcaacttcccttttcttttcat 1675 

llllll lllll I II I lllll 

Db 40099 ATAAATTAATTATTTAATTATATATTAAATTAATTTAATAAAAIAAATAAATATGATAAT 40040 

Qy 1676 cctccccactccacaccctccaattttcttcatatggttctattataagttctttataat 1735 

I I llll I I Ml llll II I III 

Db 40039 TATTATATTAATTAAATTAATAATTAAAATGAATTAATTAATTTAT---TTATAAGTAAA 39983

Qy 1736 cacagaatcaagataagtcctcagcaaacaaaaaaccatggctctcgagcaagatctgga 1795 

II llll I I I lllll I II II I I 
Db 39982 CATTTAATTTATTTTAATTTAAATTATTTATTATAATTATTATTAAGAATTAATTATATA 39923

Qy 1796 ctagtcagagctctgaatattggatcattattacagtcaaaaacagttaacaaaagctgt 1855 

I I I II I I III II II I II I I III I III II I 

Db 39922 TGAATTAAA-ATGTTATTATAATATTATAAATAAAAITAAATTTAATTATTAATTATATT 39864

Qy 1856 tgcagataaacactgaatctgctatagtttgtttttggtttacatatgttccacgtgaaa 1915

II III III III III I I I I II I II 
Db 39863 TTAAAATATTATTTAATTTAATTATTATTTTAATATATATATTAAGTGGAATAAAATAAT 39804

Qy 1916 ctatgaagcatctctaagaaaacccaaactatcatatcaacccatcgatcaatgaatcga 1975 

II I II I II I I II III I I III II I 

Db 39803 AAATTATTTATTTTATTATTAATATATTTATTTATTATAACTGCTTATTTAATTCATTTA 39744

Qy 1976 tttcaattttcgcagtataagttccttttaatcctttctttttacttcattttataacga 2035 

lllll I III II II II II III llll II I I 
Db 39743 ATTAATATATATTATATTAATTTGAATTATTAAITTATTTATTATAAAATTTAATTAATA 39684

Qy 2036 attctatggataatgttccctacaaacatgtcattacaatgtttaattataaattccatt 2095 

I I II llll llll II I I II I II I 

Db 39683 AAAAACATTTATTTATTTAAATTATTAATTTAAITAATATATATTATATAATATATTTAT 39624

Qy 2096 cttctattttactaagatattagtaacttcaaactgctgatttttactaatttattattt 2155 

I I III I I I I II II II II I II 1 1 1 1 1 1 1 1 1 II 
Db 39623 TTGAATTATTAATTTATTTATTATTTATTTTAATTAATAATATATATTAATTTATTTATT 39564

Qy 2156 ataaattgttagaatgattatttttcaataatttaacaacaatatttaatattattatta 2215 

ill iiii ii in ii i mi ii i mil i inn 

Db 39563 TTAATIATTTA----TTTAITATTTATATTAATTAATAATATATAITAATTTATTTATTT 39508

Qy 2216 ttattatttctcaatttttattaaacaaaaacataaatttttgacaaattaaaataaatg 2275 

I llllll I lllllll I III III II II I III II 

Db 39507 TAATTATTAATTTATTTTTATTTTAATAAATAATATATATTAATTTATTTTTATTTTAAT 39448 

Qy 2276 aattaatttctcaatttttcgtgcaactattacaaaaatccttcatagtcctaatcttaa 2335 

II llll II llll I I I I III I Ml II I III 

Db 39447 AAATAATATAIATTAATTTATTTTTATTTTAATAAATAATATATATTAATTTATTTTTAT 39388 
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Db 9105 TGTTTTATTTATTAAATTATTATATTAATATATTTTTATTTATTTAAAATATTATATTAA 9046 

Qy 1712 gttctattataagttctttataatcacagaatcaagataagtcctcagcaaacaaaaaac 1771 

I II I I III I II I I I III I I II 

Db 9045 ATAATAAAGACAATATATTAAACATATAAATTAATATAATATTTTATTTTATTTATTAAT 8986 

Qy 1772 catggctctcgagcaagatctggactagtcagagctctgaatattggatcattattacag 1831 

I I I II I II I I III I Ml I I 
Db 8985 TTGTTTAATATATTATTATTTTATTTAATTATTTATTTAATATATTATTATTTTATTTAA 8926 

Qy 1832 tcaaaaacagttaacaaaagctgttgcagataaacactgaatctgctatagtttgttttt 1891 

III II I I I II I I I I I llll I I II II 
Db 8925 TTATTTAATTATATTATTATTTTATTTATTAATTTATATATTTTAATATATTAITTTATT 8866 

Qy 1892 ggttt------acatatgttccacgtgaaactatgaagcatctctaagaaaacccaaact 1945

III I II II I I I I II I I I I II II I 
Db 8865 TATTTAATATAATATTTAATTTATTTAATATAATAATATTTTAATTATTTAATATAATAT 8806 

Qy 1946 atcatatcaacccatcgatcaatgaatcgatttcaattttcgcagtataagttc-----c 2000

llll I I I I II I I I I II III I llll II 
Db 8805 TTCATTTAATTCATTTAATATAATATTTTAATTATATTATTAATTTATATATTTTAACAT 8746 

Qy 2001 ttttaatcctttctttttacttcattttataacgaattctatggataatgttccctacaa 2060 

III I III III I II III I II III I II I I 
Db 8745 TTTATTTAATTTATTTAATATATATAATATTTTATTTATTAATTATATTATTAATTTATA 8686 

Qy 2061 acatgtcattacaatgtttaattataaattccattcttctattttactaagatattagta 2120 

iiii ii i iiii ii iiii inn i iiiiii n 

Db 8685 TATTTTAATATTTTATTTAATTTATTTAATATAATATTTTATTTATTAATTATATTATTA 8626 

Qy 2121 acttcaaactgctgatttttactaatttattatttataaattgttagaatgattattttt 2180 

I II I II III I II I llll III III I IIIIII 

Db 8625 ATTTATTTAATATAATATTTTATTTATTTAATTATATATATTATTAATTTATATATTTTA 8566 

Qy 2181 caataatttaacaacaatatttaatattattattattattatttctcaatttttattaaa 2240 

ii i i i ii i in mi i linn n i n in 

Db 8565 ATATTTTATTTAATTTATTTAATATAATATTTTATTTATTAATTATATTATTAATTTATT 8506 

Qy 2241 caaaaacataaatttttgacaaattaaaataaatgaattaatttctcaatttttcgtgca 2300 

II I III II II I III III II I I I II I III I 

Db 8505 TAATATAATAGTTTATTTATTTAATTATATATATTATTAATTTATATATTTTAATATTTT 8446 

Qy 2301 actattacaaaaatccttcatagtcctaatcttaatttgatgcagaggtgataataatct 2360 

II I I III I I IIIIII II I I III I 
Db 8445 ATTTAATTTATTTAATATAATATTTTATTTATTAATTATATTATTAATTTATATATTTTA 8386 

Qy 2361 taatttgatgcagaggtaataatgggccgggtttgagctggactta----agcatgatat 2416

llll II I llll II llll I II I I 
Db 8385 ATATTTTATTTAATTTATTTAATAIATATAATATTTTATTTATIAATTATATTAITAATT 8326 

Qy 2417 tgacgtactttatatttttccaaattcaacccagctcgaaatatgagtctaaaattttgt 2476 

I II Mini llll I II III I I II I I I 
Db 8325 TATATATTTTAATATTTTATTTAATTTATTTAATATAATATTTTATTTATTAATTATATT 8266 

Qy 2477 ccaatttaatccaagcccattttaagttcgtccatattattttttaatttaaaaaattta 2536 

Ml II I II II II II II llllllll I llll 
Db 8265 ATTAATTTATATATTTTAATATTTTATTTAATTAATTATATTATTAATTTATATTTTTTA 8206 

Qy 2537 tatcattttattttaatatttaattattttatatattttttatttattgaaaatttttat 2596 

IIIIII IIIIII III I II III II II I III llll 
Db 8205 ATATTTTATTTTATTTTATTTATTTAATATAATATTTTATTTATTTAATTATATTATTAT 8146 

Qy 2597 atagtcatcttaacattatgttaatgtttatattagagtagtattatatatatttagtat 2656 

I II I III II II II I III I llll I lllll I 
Db 8145 TTTGTTAATTTATATATTTTAATATATTATTTTTATTATTTTATTTATTTAATTTATTCC 8086 

Qy 2657 aggtttattttgttaataaacttaaaaatgggtcttgtgggctagacttggaccttaaat 2716 

I llll I I I III I lllll I II I II I I I 
Db 8085 A-TTTAATATATATATATATAATACAAAAGATAAAATATTCTTTAATTATATTTATTAA 8028 

Qy 2717 gctcaaactcaaacttaattcatattttaaacaggcttaatatttttatttacactgttt 2776 

I II I I lllll II II I I II I llll llll II I I I ' 

Db 8027 TA1AAATGTGTACTATAATTAATTATTAATAAAGAAATIAIATAATTATATATAAACTAT 7968



Qy 2777 caaatttttcgggtgaaatatcttcgagtctagattaataacaccacaggtctaatttga 2836 

III lllll llll I llll lllll llll 
Db 7967 TAAMTTAATGTGAAATATATTATTATATATATAAATGAATTAATATATATTAAATTATT 7908 

Qy 2837 tgctcaatga 2846 

Mill 

Db 7907 TATTATATCA 7898 
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LOCUS PFMAL3P5 86829 bp DNA INV 11-FEB-2000
DEFINITION Plasmodium falciparum MAL3P5, complete sequence.
ACCESSION AL034556 AL008971 AL008972 AL008978 AL010141 AL010153 AL010162
AL010206 AL010210 AL139179
VERSION AL034556.2 GI:4493931
KEYWORDS HTG.
SOURCE malaria parasite P. falciparum.
ORGANISM Plasmodium falciparum
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium.
REFERENCE 1 (bases 1 to 86829)
AUTHORS Bowman,S., Lawson,D., Basham,D., Brown,D., Chillingworth,T.,
Churcher,C.M., Craig,A., Davies,R.M., Devlin,K., Feltwell,T.,
Gentles,S., Gwilliam,R., Hamlin,N., Harris,D., Holroyd,S.,
Hornsby,T., Horrocks,P., Jagels,K., Jassal,B., Kyes,S., McLean,J.,
Moule,S., Mungall,K., Murphy,L., Oliver,K., Quail,M.A.,
Rajandream,M.-A., Rutter,S., Skelton,J., Squares,R., Squares,S.,
Sulston,J.E., Whitehead,S., Woodward,J.R., Newbold,C. and
Barrell,B.G.
TITLE The complete nucleotide sequence of chromosome 3 of Plasmodium
falciparum
JOURNAL Nature 400 (6744), 532-538 (1999)
MEDLINE 99376085
REFERENCE 2 (bases 1 to 86829)
AUTHORS Bowman,S., Skelton,J., Churcher,C., Lawson,D., Quail,M. and
Barrell,B.
JOURNAL Unpublished
REFERENCE 3 (bases 1 to 86829)
AUTHORS Lawson,D., Bowman,S. and Barrell,B.
TITLE Direct Submission
JOURNAL Submitted (17-DEC-1998) P. falciparum Genome Sequencing Consortium,
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge
CB10 1SA, UK
COMMENT On Mar 24, 1999 this sequence version replaced gi:4034877.
For more information about this sequence or the Malaria Project,
see http://www.sanger.ac.uk/Projects/P_falciparum.
FEATURES Location/Qualifiers
source 1..86829
/organism="Plasmodium falciparum"
/strain="3D7"
/db_xref="taxon:5833"
/chromosome="3"
/clone="MAL3P5"
gene 324..2944
/gene="MAL3P5.1"
CDS join(324..668,1199..1303,1460..2944)
/gene="MAL3P5.1"
/note="predicted using hexExon; MAL3P5.1 (PFC0575w),
Hypothetical protein, len: 645 aa"
/codon_start=1
/protein_id="CAB38969.1"
/db_xref="GI:4493933"
/db_xref="SPTREMBL:097258"

/translation="MYLKNVYIYISSCFILFDLCFSFHLLKMKYKNHMNNMKSVTFFL 
RSPQIYRRRFRRSRIRNVSFKKRQRKPLFLFENLRKGFSFLGFWRNQYDQVNKRERRK 
KKKKKKKKKKNPKVHSILNQISEKVKEKKDAENYLALHLFLLKDENITLFSMMHIMDF 
FKSKQRVIECIRDIRSKRKRRKNLSIYINLFICTLIYFTYCMCLLIRYISHLCIFFFF 
FFCFFLCYNILERIYEECVGDLIRRKIERYEYCERKRIKFHMKDAIRKMEINMKDDD 
LYFNYHYDELLRCFTMKLNIERNNKNIIRSNYDNINNDISIDKDMYMNNPIDVNINNI 
SLDEKIKEQFENPDDENLRELRDTYEQFQLFNDNIIKYIEEDQPLYNINDNSNINDNN 
NNINTMKNRHKIKDTYNDDDDYDYEKEEDLVIQKNIDDYIYKNTIGMNKSLEEFKNQF 
IEQADIEFQNFLSNVNLDQHGRVKSNDENTRSTEHIKNRNTINKGYDTELIQNQMENN 
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TITLE Direct Submission 

JOURNAL Submitted (24-SEP-1998) P. falciparum Genome Sequencing Consortium,
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge
CB10 1SA, UK

COMMENT On Dec 16, 1999 this sequence version replaced gi:5763807.

For more information about this sequence or the Malaria Project,
see http://www.sanger.ac.uk/Projects/P_falciparum. IMPORTANT: This
sequence is unfinished and does not necessarily represent the 
correct sequence. Work on the sequence is in progress and the 
release of this data is based on the understanding that the 
sequence may change as work continues. The sequence may be 
contaminated with foreign sequence from E.coli, yeast, vector, 
phage etc. 

FEATURES Location/Qualifiers
source 1..67970
/organism="Plasmodium falciparum"
/strain="3D7"
/db_xref="taxon:5833"
/chromosome="1"
gene complement(1748..3276)
/gene="MAL1P3.01"
CDS complement(join(1748..2598,2748..2848,2990..3276))
/gene="MAL1P3.01"
/note="MAL1P3.01, conserved hypothetical protein, len: 412
aa, similarity: UPF0006 family eg to
YBL055C/YBL0512/YBL0511, YBF5_YEAST (418 aa), fasta
scores: opt: 316, E(): 1.1e-12, (33.2% identity in 271 aa
overlap)"
/codon_start=1
/product="conserved hypothetical protein, UPF0006 family"
/protein_id="CAB63556.1"
/db_xref="GI:6594244"
/translation="MKLVFHYIKYINVLFYISIIFLKSNSLKIYNDLRYISTVNKYKV
LQIKKRSNLKKNHNIRKMEDNESSFIDIGSNLTDKMFDGVYNSKKHENDLQNVLNRM 
NNNVDRIIITCTCLAEIDKSLRICETYDPEGKFLYLSAGVHPTNCYEFIDKNKHEEKE 
IIAKKEYEEFIKYFKNEQVENSKMENGNKKICDGEKDMNNLNEILLEKNLDTIPGFKY 
NEKDREYLENLRNKIIKYPKRIVCIGEIGLDFDRLYFCSKYIQIKYFIFQLKLVQMFN 
LPMFLHMRNCSETFFKIVDIYKFLFEKNGGVIHSFTDREDIVHIIVQNYKNLYIGVNG 
CSLKSLENINAVKKIPLNLLLLETDAPWCGVKKTHASYEYIKDTYEKRAYTNLKKIKN 
I IRCDDNT IFKERNEPYNIA" 
misc_feature complement(2599..2610)
/gene="MAL1P3.01"
/note="potential splice acceptor sequence"
misc_feature complement(2742..2747)
/gene="MAL1P3.01"
/note="potential splice donor sequence, atg/gttaaa"
misc_feature complement(2849..2861)
/gene="MAL1P3.01"
/note="potential splice acceptor sequence"
misc_feature complement(2984..2989)
/gene="MAL1P3.01"
/note="potential splice donor sequence, aaa/gtaaaa"
gene 5005..5496
/gene="MAL1P3.02"
CDS 5005..5496
/gene="MAL1P3.02"
/note="MAL1P3.02, hypothetical protein, len: 163 aa,
contains possible signal sequence"
/codon_start=1
/product="hypothetical protein, MAL1P3.02"
/protein_id="CAB63557.1"
/db_xref="GI:6594245"
/translation="MRLLNNRFVVLCPIIILFFFLNSWLGNNNRNNINFHETENAAK
AMRKLLSGEINSIKLDNGDELKIKLNDEKHKDSTKWDKSYSFISNLEEEKYSQTDLFR 
KKQEINEANTKIIEDRQEFYILNNDEIENIATRFVLENNFDEIYIQSFKQSLIDIIQS 
LNN" 

misc_feature 8020..10389
/note="possible cen1, region of very high [a+t] content"
gene 14884..20352
/gene="MAL1P3.03"
CDS 14884..20352
/gene="MAL1P3.03"
/note="MAL1P3.03, putative ABC transporter, len: 1822 aa"
/codon_start=1
/product="putative ABC transporter"
/protein_id="CAB63558.1"
/db_xref="GI:6594246"
/translation="MTTYKENVGISNKGNRRRRSCQNISFLNFLSFDWIRPLINDLIR
GDIQELPNICRNFDVPYYASRLEENLRDIEVEDSEFYSERNSSNEHVLHHCNSNDASE 
KKVYNVYYHNILWSILKTFKFRIILIISFYILETLIVTLGGRFIDYYMRILEGQKIPV 
YISFLRDFRVFSGLVWMIMFFHLFFEALLHFYFHLFTINLKVSLMYFLYKINLCSNN 
NfiLQNPDAFYNTYRRFSSQTEIDEISRDFLSIGRNASSSSSGIKNNNRNIDNNRFVEN 
DYIINFIKSTRRMERDSLNENRSLPNVNIYNIMFSDVPSVTFFVTSCIEFNVFVRIF 
MSFYV7HIKIGSNSVGIAIWLSIALYSAMILFEFLPSLFRSRYLIYRDRRIDNMHHVL 
REFKLIRMFNWESFAFKYINIFRMKEMRYCKIRLYLSNIGVFISSISSDIVEWIFFI 
YLKDRLNKKEEI KFT S I IMPLYVYKI L I SNV ANFPNLVNNVMEG I VN I KRLNNY I NDH 
LYYNDIRNYFMYRTRYNEDYNIWDRTFLQNENITSHDDGTSHNLKHLRNVIKNRLTN 
MFRYFFFYHRMNYHKNIINKQILSGLLKNVDDNTNKKICFQEHKSNSTYNYNSSHIHE 
RKEEYENIHNSSNSTMSNEFKERKKNNEYIIRLENCSFGLSYDNKCDNDHILKNINFN 
LKRNSLAIIIGNVGSGKSAFFHSILGDFNMTHGNLYIENFFKKMPILYVPQNSWLFMG 
NIRSMILFGNEYNPLIYKYTILQSELLNDLSTIEHGDMRYINDDHNLSRGQRVRICLA 
RALYEHYIHMHKLCTDYEKKLIQPNEILDKDLINNKNISSYNNRKSRLVNYNIPFNEN 
YLQKCLMDDNNFYLYLLDD I FTSLDPSISKK I FSNLFCKEDNI SFKDNCSFI I SMNKS 
TLDNFLIEDILDNVQYEVNIFEIQDKTLKYRGNISEYMEKNNLNITKESHWGYSNLNT 
IDYTRIRLFDEVELNHVRHSNRMIYREAYFVRGNTESVSFEIDSINKEYIRKMKKKNY 
RKEHMMNNRDNNNNNNNSNRDDHINIKMNDNHRNYNDINLGPNSTDDSPTVSSLGNE 
YTLDTYTSNNSDREEIVKPLYKDTHEEFNKSSSMPFVKSSSNNINNPSNFRYEDNSSS 
FKGSISLETYLWYFQQVGFVLLTSWIFMLISIFTDEIKFVFLTMMSIISRNNKEHSD 
TILQRQVRYLEYFVILPIISLVTSGICFSMIIYGNITSAIRVHNNILYSILNAPLYIF 
YNNNLGNIINRFIIDISAFDYGFLRRIYRAFFIFFRCILSSLLIIYMIRDCIFIFPFV 
IILIYFFVFRRFSRGCREAQRLYLSCHTPLCNIYSNALSGKNIINIYRKNTYHLDVYE 
HYINNFRISYFFRWLINIWASLYIRIFILLLTTYIIMHPHLYASGIIRLYRERNYVRI 
LSTLGYCISFSARLGVIIRFLLCDYTHIEREMCCVQRLEEFAKISNKENASMNKENEL 
NVITTOTYRERNENISDKISAIVEYKNVSLSSIINSSQDDESRRKYGIRFENVTVSYK 
KRIPLVNGTYRYIDEEPSLKNINMYALKNQKIGIVGRSGAGRSTILLSILGLINISQG 
RITVEGRDIRTYHRRGEDSIIGILAQSSFVFYNWNIRTFIDPYNNFTDDEIVHALRLN 
GINLGKNDLYRYMHRQDMRSNYKKIIQTSRVINQSNDNTILLTNDCIRYLSLVRLYLN 
RHRYRIILIDEIPIFNLNNSVHDELNSFLIGRARSFNYIIRNHFPNNTVLIISHHANT 
LSCCDY I YVLRKGEIT YRCS YEDVKTQSELS HLLEMDD " 
rRNA 23896..31533
/gene="rRNA"
/note="region containing small subunit, 5.8S and large
subunit rRNA genes and spacer regions"
gene 23896..31533
/gene="rRNA"
gene complement(31966..32775)
/gene="MAL1P3.04"
CDS complement(join(31966..32476,32675..32775))
/gene="MAL1P3.04"
/note="MAL1P3.04, conserved hypothetical membrane protein,
len: 203 aa, similarity: P. falciparum chromosome 2,
PFB0110W, 096126 predicted integral membrane protein (255
aa), fasta scores: opt: 335, E(): 4.9e-15, (36.1% identity
in 191 aa overlap)"
/codon_start=1
/product="conserved hypothetical membrane protein,
MAL1P3.04"
/protein_id="CAB63559.1"
/db_xref="GI:6594247"
/translation="MRKSYTFINVTILLFLTLLLFLTYYNYDTFSRTRFNNNIKIDIN
RFRRIIAEASEEQRYPWEEDFCLILNEEELIRPEHNDSPYLPEHYENIDKINELSINS
TRIWRETIKRMRQNYEKETDNMNHNWRDFMWHYRWANIYLYRVHKLINnLKDLTNPI
HDKEETITTWIRWIQEDIEYFLFNLQVEWLRILTLELFYRNKE"
misc_feature complement(32477..32486)
/gene="MAL1P3.04"
/note="potential splice acceptor sequence"
misc_feature complement(32669..32674)
/gene="MAL1P3.04"
/note="potential splice donor sequence, aaa/gtatat"
gene 36657..37343
/gene="MAL1P3.05"
CDS join(36657..36743,36864..37343)
/gene="MAL1P3.05"
/note="MAL1P3.05, hypothetical protein, len: 188 aa"
/codon_start=1
/product="hypothetical protein, MAL1P3.05"
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NIDYIENNIIYSYIKSFKRTPPITKiyLLGTFLLSVLIHMNKNVYKLILFDPNKIFKK 
GE IWRLFT PYLY IGNLYLQYILMFNYLNI YMSSVE ISHYKKPEDFLI FLTFGY I SNLL 
FTIWANMYNENIMNVKLYIHNFKNFFIKDCVSKYTSRSSTNNNSNNINSNNRSSNNNN 
HYNNSKNIDIKKEQYNHLGYVFSTYILYYWSRINEGTLINCFELFFIKAEYVPFFFII 
QNILLYNEFSLYEVASIFSSYLFFTYEKYFKFNYLRLFFKTLLKVIHIYPLYDRFQMN 
TKLLISILKKRKNGPLPFLQKCYIHNVPRINTMKHMNISDLRKNIEVCNKKNVKHKNV 
WSNFLYIIILKLFNKEIKNYVDFMIILKLLSKYIKIEKKVLLYICEQIEHEIYKFRTR 
DLTLLILILRKNNFDNIYYINLISKSILMKMNKNMSYKDLALIIYSLSKNIYLTDEQI 
YNKE IFNFS ILKFENHLNNVNINLHSLSLFFYS YSVYF INNCFYY YYYFHSFFNI IT K 
FINIINKNLHLYNSTDLMFLYIGSLHIHNMYTPNHVDQNKEPKNNQKENNNYHNDNHN 
IYLKNINNNCYDHRLDSNDFITMTNYDQGEYNKHIQQNKHIQQNKHIQQNKHIQQNRH
IQRIGTHCTESNSNNQQLIQIQNDEKENKLITYDNSKHNLLKDPCQHNIVEKDGEKKQ 
NLIKNLIINIKKIIEEKLSSFKIQEIVNILFVSLNKNIIINKKYFHFLNQEKINIRNY 
INIYVNINKIYLNDEEENTSHCILKIKNDNKKDILYHDHMKFLYNLMNEIIYRNDLLN 
MKQIILLLYGLKFNNFMFLQFEKIILKRFICLPKKEIQKIGKEEIMFLYQYFFVRTCL
FNELKKQNNLFISQDEYENYIYISDKYNESAKLDNSYNMPSNLKEKNTNHHGGKDNTL 
DLYIHDDLFYMNKNKKRDRYKIYLYDNFIFNYPAYYVEQKKDHIDYNESVNNFDNMKS 
FIQLKKKKIMINNNNNNNNNNNNNIYIDTNIQTVNKNYSCTHNNVIKNETNDNYPNS 
TIRNQHPNDQVH.NNPVFFYNKKLNWDSIDFEYELTCYNLYLDIYKIVCLKLLTLLK 
NHKLSCLQSIDILCIYEKLNIRDYRIIKYLYNLKKELLYLDNTYLLKVINIIVKFNLY
NMISYLQINKILTFINYNNINESIQILKLIGMLISVHKHNKLSPPHMNNLNVQNAANY 
LFKNLYNLQNIQDLKKIEMMNVYDNLTFKFYKLFKNILSINVKRYVQNCNSYNKYEMN 
THTM-NKNEQHKYIHHNlTOHKDGRHNNNNNHYDKVDVSSSSSSSYyyYLNKSGRNLG 
NINVQNLDDINIHIKSISYKIKKDOIKDIGYMRVSKYSELMKSMKMMNYDEHFNDEY 
RNVCDEIYEDLFLIYNKNIQVYKNINICNYTFPMAINLLTLNNDENILININKSDDNK 
KLIKVDKKKFLIVDILYNYDYYYTLTKSKI.DKLKEYNIYLSYYSNHIKKKNKKILNYK 
KYALLKLIKKRGFNYICIDADTYVKNKKGKSKDLSYEINKLYINNLILDILKRQKKNH 
LHPHPHTQNRTTKQIKNINIKNKLLLYHQNKKNVKKIIHFKNYKYKIMNLPDQRNHYH 
NKRIKYIKDKSLLAINHKTKNIIEKQKISTSNHLSKLKRMFSL" 

gene complement(20528..21454)
/gene="MAL3P5.5"
CDS complement(20528..21454)
/gene="MAL3P5.5"
/note="predicted using hexExon; MAL3P5.5 (PFC0595c),
Serine/threonine protein phosphatase (PP2), len: 309 aa;
Similarity to serine/threonine protein phophatases.
M.domestica serine/threonine protein phosphatase
(TR:Q42912) BLAST Score: 1005, sum P(1) = 6.9e-107; 60%
identity in 301 aa overlap."
/codon_start=1
/protein_id="CAB38970.1"
/db_xref="GI:4493934"
/db_xref="SPTREMBL:O97259"
/translation="MAKGEERKWIEQLRMNPPKLLDESDLRLVCQRVKEILVEENNVQ
SIKPPVIICGDIHGQFFDLLELFDVGGDIMNNDYIFLGDYVDRGYNSVETFEYLLLLK 
LLFPKNITLLRGNHESRQITTVYGFYDECFKKYGNANAWKYCTDIFDYLTLAALVDNQ 
IFCVHGGLSPEIKLIDOLRLINRVQEIPHEGAFGDIMWSDPDEVDDWVANPRGAGWLF 
GPNVTKKFNHINNLELIARAHQLAMEGYRYMFEDSTIITVWSAPNYCYRCGNVAAIMR 
IDEYMNRQMLIFKDTPDSRNSIKNKATIPYFL"

gene 25252..26157
/gene="MAL3P5.6"
CDS join(25252..25296,25453..26157)
/gene="MAL3P5.6"
/note="predicted using hexExon; MAL3P5.6 (PFC0600w),
Hypothetical protein, len: 250 aa"
/codon_start=1
/protein_id="CAB38972.1"
/db_xref="GI:4493936"
/db_xref="SPTREMBL:O97261"
/translation="MKKYLNKYMYIYNIYMLEEKYRNFLRLRNMNSHMGASQNMNVN
NNYTMNELEEFEKINNNYNNNNNNINNNINNYYDYMNIKVSQSVQHNKRLQDFYNNKN 
SFQHYIKKLKTCRFDADDIRNLLEKRLAYERDNTLIKNIQEEENKKGIGINGNFGSES
NSSSSNYDNNYLLYRKINRLNKTNTNKSKNRSRKRKRINSKIDKKYIIKCRACKFINP 
KGFKIEDYYTCQNCGYNDFSVIRSTSPNNAD" 

gene 27547..28290
/gene="MAL3P5.7"
CDS 27547..28290
/gene="MAL3P5.7"
/note="predicted using hexExon; MAL3P5.7 (PFC0605C),
Hypothetical protein, len: 248 aa"
/codon_start=1
/protein_id="CAB41709.1"
/db_xref="GI:4725991"
/db_xref="SPTREMBL:Q9 YOU"
/translation="MGGHGGLNILPQKKWNVYRRDAQYKVHYDEHRIIKEEKDKEIKR
KRDEFESTISTLKKNMTKNEDSDNNYNNFYDENGEKKTTTNYCNDHINLFIDEEKELT 
ARQKKHEEFLIRRGHYIYYDRNFNTQHNSIYDRNKNAQIISDFNKMKLCERDWFLNRK 
NKNEKTKDNGANFFHIQKDNISEEHNKTENINSDLSLYCNTNNYITHDKKKEKKQMHY 
HIKKIIKYKQEKDKEKKRKRQGKEKKKPK" 

gene complement(29992..33537)
/gene="MAL3P5.8"
CDS complement(29992..33537)
/gene="MAL3P5.8"
/note="predicted using hexExon; MAL3P5.8 (PFC0610c),
Hypothetical protein, len: 1182 aa"
/codon_start=1
/protein_id="CAB38971.1"
/db_xref="GI:4493935"
/db_xref="SPTREMBL:O97260"
/translation="MAHKVKREKKTEAQETPWAKEQTHAKEENNESNIAVTEENVIS
KNGQEIAISKNDQEIAISKNDQEIAISNNDQEIAISKNDQENVALNSSEERQNASKEE 
DNELRQIREFHDISNENEHNENRSFSTSTLSSFFREYEENSVEQHFFSEGTHTEHSME 
DSNNVETIENAITNDVLRSNRSTSYSKQKNELTSVTCYVCGETVDLNIWSDHIFAHKL 

Query Match 6.1%; Score 185.6; DB 33; Length 86829;
Best Local Similarity 45.6%; Pred. No. 6.2e-12;
Matches 1207; Conservative 1; Mismatches 1396; Indels 45; Gaps 14;

Qy 138 aatataataaatacatcgtagaaataaattttattcaaattgaagtcttaaccatcttta 197 

III I II II I I III I I III I I I I I II II 

Db 38731 AATTTCATTAAAGTTACAAATAAAATATATATATAGTATAAAGAAATATTAAGAACTATA 38790 

Qy 198 atatttgtagatgtaatttaaatgaaagataaatacatattcttggacatgtattttcat 257 
Mill II II I III I II Mill I I III II v. 
Db 38791 TTATTTATATATACATTTTTTTATTTAATTATTAATATATAAATAATATTTAATATACAA 38850 

Qy 258 cttaatgtttgtggctttggtgataggtgtattgatgtacgatgtcttttaaatcacata 317

I II I Mil III I III II I II II I II 

Db 38851 AT ■ • ATAAAICATGCTTTTAAAATAAATATATAATTTTTATTAATATATAAATATAAGTA 38908 

Qy 318 tcacattttgagtttgtatgatgataagtcgacataancgaaatatggtgtgatcttcac 377

II I I II .11 II I I I III Ml 

Db 38909 TTATTTATATGTTATAATACATTTTATTTTAATTCTTTCTAATTTAATATTGATAACATA 38968 

Qy 378 ttttgaactttgataagtcaccaaactttaacaaagtttgattgtgtacatatatatata 437

Ml I IMI I I II II II III I II II Ml 

Db 38969 TTTTTATAATTAATTTACACATTTATTAAAAAAACATTATATTAATAATATTAATTTATA 39028 

Qy 438 tatcttcaaattttataataaaaattgtgtttaaataatttacagttatattattttttt 497 

II II MM I III II II III I II II I 

Db 39029 TTTTAAAATAAAATATAAATATTTAATAAAATAATAAAAAAAAATATATGTAATAGTTAT 39088 

Qy 498 atctctaattttatttgtcgccaaatttttagttgatattttaacataaaaaaaattgta 557 

II I Ml III I MM II III I I Mil II I I II 
Db 39089 ATATATAATATTAAATTAATATAAATTAATATAATATAAT- • ■ AAATAAATAATAATATA 39145 

Qy 558 cacatttacaagcccatatacaaataattatataaatattcattaaaaaatatatttaaa 617 

I III I I MM I I III I MM I I II III I I II 
Db 39146 TATATTAAATAAATAAAATAAACAAAATAAATTAAATTATTTTAAATTAATTAAATAAAT 39205 

Qy 618 tataggatataaatataactattttagaattattctactttaagataacataggttaaat 677 

I I II III I II IMI I II III I I II 

Db 39206 AAAATATATTATTTATTAAAATAAATAAATTAATATATATTATTTATTAAAATAAAAATA 39265 

Qy 678 gtataattaataaggttagtttattgtaaagatgagtatatatgtcgtaaacataatcac 737 

Ml III Ml II III I MM Ml III 

Db 39266 MTTAATATATATATTTATTAAAATAAAAATAAATTAATATATATTATTTATTAAAATAA 39325 

Qy 738 taaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaagggaaattt 797 

II I II II II II II I III II II I III 

Db 39326 AAATAAATTAATATATATTATTTATTAAAATAAAAATAAATTAATATATATTATTTATTA 39385 

Qy 798 gagagtaagttcatgtttatattatacataatgaagttgatgttttcttctttttaatat 857 

I II I II MM II III III I I I I I II III 
Db 39386 AAATAAAAATAAATTAATATATATTATTTATTAAAATAAAAATAAATTAATATATATTAT 39445 
Matches 1194; Conservative
Mismatches 1448; Indels 33; Gaps 10;



Qy 125 tttcttgtgttacaatataataaatacatcgtagaaataaattttattcaaattgaagtc 184 

in i ii I ii i in ill ii mini I! mil i i i 

Db 6879 TTTTTATAGTAAAAAAAATATATATATATATATATAATMATGATAAACAAATAGGATTA 6938 

Qy 185 ttaaccatctttaatatttgtagatgtaatttaaatgaaagataaatacatattcttgga 244 

I I II II! I I III! Illl I! I I I III I 
Db 6939 AAATGTTACCTTTTTATATATTTTCTGAATTAAAAT--AATTTCTTTCCCAATTTCTAAT 6996 

Qy 245 catgtattttcatcttaatgtttgtggctttggtgataggtgtattgatgtacgatgtct 304 

II I! II I III I I I I III I Illl II II I 

Db 6997 TATATAAAAATATATATATGATATTATCATATTATATAATTATATTTATTTAAATAATAA 7056 

Qy 305 tttaaatcacatatcacattttgagtttgtatgatgataagtcgacataancgaaatatg 364 

I I I I I I I I I I I II I I I III III I! 
Db 7057 TAATATTTATTGAAAATAAATATCTTMGGAAAATMCATATATTTATATTAGAAGTAAA 7116 

Qy 365 gtgtgatcttcacttttgaactttgataagtcaccaaactttaacaaagtttgattgtgt 424 

II I I II II I I II II II I 
Db 7117 AGAAAATTATTTTTATTTAATATAAAAAAAGTTATATATAAATTAATATGCATAAATATA 7176 

Qy 425 acatatatatatatatcttcaaattttataataaaaattgtgtttaaataatttacagtt 484 

llllll II I III Illl I Illl III Illl I II I 
Db 7177 TAATATATTAATTAAATAATAATTAAATTAATTATAATTATGTATAAAATAATTTATATA 7236 

Qy 485 atattatttttttatctctaattttatttgtcgccaaatttttagttgatattttaacat 544 

ii I mm i ii nun i n in i 1 m n n 

Db 7237 TTAATTAATTTTIATTTATATATTIATTAATTTAATTATACTTAAATAAAATIATATAAT 7296 

Qy 545 aaaaaaaattgtacacatttacaagcccatatacaaataattatataaatattcattaaa 604 

I I II II III II II I I I I I I II I I 
Db 7297 TATTTATTTTATATTTATTATTAAATTAATTAATATAAACAATATAATAATTTATTATAG 7356 

Qy 605 aaatatatttaaatataggatataaatataactattttagaattattctactttaagata 664 

I II MINIMI I III II II I I I I I II III 

Db 7357 GTAATTAATTAAATATATTAAATATATTAAAAATAATAATATATTTAATATAAAACAAAA 7416 

Qy 665 acataggttaaatgtataattaataaggttagtttattgtaaagatgagtatatatgtcg 724 

I I II III llllll I II I I III I III I I 
Db 7417 ATTAATATTTAATAATTAATTATATAATATATATATATATAATATTATTTATIATTATTT 7476 

Qy 725 taaacataatcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaat 784 

III II II I llllll II I I I II I I I I I 

Db 7477 TAATATAAAAATATATTATAATATATTAATTATTACACTGTTATTTTATATATTAATATT 7536 

Qy 785 ggaagggaaatttgagagtaagttcatgtttatattatacataatgaagttgatgttttc 844 

II II I I I I II Illl III I II III I 

Db 7537 ATAATTTATTTAATATATAATATATATATTTAATTTAATAAAAAAGAAAATA T 7589 

Qy 845 ttctttttaatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaa 904 

I I Illl II III III I II I II III I II II Ml III 

Db 7590 ATATGTTTATTAATTTAATAAATAAATTATATATATATTACATATATAATTTAACTTAAA 7649 

Qy 905 tataatgaaagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataat 964 

III II I II I II I I I I I lllll 

Db 7650 ATAAATTAATAATAAATAATTTCATTTATTACCATTTAATTTAATAATAATAAAAATATA 7709 

Qy 965 agataaattaattgtggtacattagatcaaagaacaaactagattttgtcccattctatt 1024 

MUM II I Ml I II lllll I III II 
Db 7710 TTTTAAATTAAAATTAATTATAAATTAATTATAAATATATATATTTTATATAATTTAATA 7769 

Qy 1025 gttaaaagctggtccgtttacattaaaataaggtacatgttacatgccacgtataactat 1084 

llllll III III III I I II I II I I 
Db 7770 TTTAAAATATAATTATTAATTTATAATAAAAATAATAAAATATAAATATATTAATATATT 7829 

Qy 1085 ctggttattctatcaatcacgctaatttttaacagtagaaatgaatgtaatttttaaata 1144 

I I III II I I Illl I I I II III I I II I 
Db 7830 ATTTTAATTAATCTATTTGTTATGTATAATAATAAAACTATTATATTTWftTATATAAAA 7889 

Qy 1145 gaaagggtcaaattgttatttgatctaacacgtagggattaatttacttattttcctaaa 1204 

Illl II II II I II I I III I II! ,! 
Db 7890 TGTGTATATAAATAAATAAITCTTTTTAATTTTATTTTTATAATTAAAITATTTTATTTA 7949



Qy 1205 gaaataagtaaaatataatttgaatcttaatacaaaaactttcatgatacttttatcata 1264 

lllll III!! I III I III II I I I I I I 
Db 7950 ATTATTA-TTATATATATATAATTATTTATTTTTTAAATGTTAAATAAATAATAAAATAA 8008 

Qy 1265 ttttacttataatttaatattgtgagagtaacaaarttaaaaaacatagaaacaccaaaa 1324 

III III lllll I I I III II III I Illl Illl 
Db 8009 TAATATAAATTAATTAATTAAATAAAAATAATAATTATAATTATAATATATTTACATATA 8068 

Qy 1325 gttagttatggtgtgactcatatacacagttaaaatttgaataaatttttttcttcgtca 1384 

I I I Illl III III lllll I 
Db 8069 AAATATATATATTATATATATITAATTTATTATTTTTTATTTAAAIGAATAATAAATAAT 8128 

Qy 1385 ttaattccatcatgggtttttttttttctagttaagccataattatcaaaataatcatca 1444 

II I Ml III I II II I III I Ml Illl I 
Db 8129 TAATTATATTAATTTAATTTATATTAAATATTATTTATATATATTTAAATTAAATTAATA 8188 

Qy 1445 ttaatcctatcaataccccgccctgcctccctccctcaatacttaaacccaactaacacc 1504 

Illl Illl II I I II lllll I II II 

Db 8189 TATTTATTTTTAATA AACTTAATTAATTAATTATTATTTATAAAAAAAATATATT 8243 

Qy 1505 cagcaccaaacgcactttaatagccacctatttctagccatgtccttgcacttaaagaaa 1564 

I I I lllll I III I II I I II I lllll I I 

Db 8244 AAATTATGATATATATATAATATTAATATATATTTAATTAATTAATTAAAATTAAAAATA 8303 

Qy 1565 agtaaagctaacctgcaatcattccatatcgaggcctcaacagataaagttggttgatgg 1624 

III II I Ml II II II II II II 

Db 8304 TATTAAATIAITTTTTAATTAAATTATTATATATTATTTATGAGTCIATTTATTTTATTT 8363 

Qy 1625 gtttgcaccaagttgttaaaacccggccctcaacttcccttttcttttcatcctccccac 1684 

II I III llllll I II III Illl I I I 

Db 8364 TTTATAATTAATTAATTAAAATTATATATTTAATAATATATTTTGTTTCTTTGTATTAAT 8423 

Qy 1685 tccacaccctccaattttcttcatatggttctatt ataagttctttataa 1734 

I I Illl I II II III Illl Illl. 

Db 8424 ATATTATTATTTTATTTATATTATTAAATTAAATTATATTAATTAATAAAAATTATAATA 8483 

Qy 1735 tcacagaatcaagataagtcctcagcaaacaaaaaaccatggctctcgagcaagatctgg 1794 

II II I I I I I I Illl II I I I I II 
Db 8484 TTATTTCATTTATTTTAATATATATTTATATAATAATAATTATTTCCAATATAAATAAAT 8543

Qy 1795 actagtcagagctctgaatattggatcattattacagtcaaaaacagttaacaaaagctg 1854 

II I I II II Illl I I I I I I MM 

Db 8544 AATGTATTTATTAAAGTTAATATAATTATTAATTTATTTTTATAAAATAAATATTGCTTT 8603 

Qy 1855 ttgcagataaacactgaatctgctatagtttgtttttggtttacatatgttccacgtgaa 1914 

III I II I Illl II lllll II Illl I I I 
Db 8604 AATTCATTAATTATAATATTTATTATA-TTATTTTTAAAATAATTATGATATAATATTA 8661 

Qy 1915 actatgaagcatctctaagaaaacccaaactatcatatcaacccatcgatcaatgaatcg 1974 

III I III I II II I II II I II I 
Db 8662 TATATATTAATAATTTAATTAGTATATAAAAAITTAAACATATACATTATAATTTACACA 8721 

Qy 1975 atttcaattttcgcagtataagttccttttaatcctttctttttacttcattttataacg 2034 

I MM llllll lllll I III Illl II III I 

Db 8722 TAATTATTTATTTTATTTTAATIAATITTTACITATTTTAATTTAATTATTTTATTTTAT 8781 

Qy 2035 aattctatggataatgttccctacaaacatgtcattacaatgtttaattataaattccat 2094 

III II I I III II Illl II II II I 

Db 8782 ATTATTTATTTATTATTTTAATTAATTTATTTAATATATTAATTTATTTTTATTTTATTT 8841 

Qy 2095 tcttctattttactaagatattagtaacttcaaactgctgatttttactaatttattatt 2154 

II I III M Ml I III III Ml I I lllll I 

Db 8842 ATTTATTTTTATTTTATTTAITTATTTATTTATTITTATITTATTTATTTAATTATTTTA 8901 

Qy 2155 tataaattgttagaatgattatttttcaataatttaacaacaatatttaatattattatt 2214 

III II II I Illl III I Illl I llllllllll I II 

Db 8902 ATTAATTTATTTAATATATTAATTTATTTTTATTTTATTTATTTATTTAATATATTATTT 8961 

Qy 2215 attattatttc-tcaatttttattaaacaaaaacataaatttttgacaaattaaaataaa 2273 

III llllll I Illl Illl I II I I I Illl I I I 
Db 8962 ATTTTTATTTTATTTATTTATTATITAAATTTATTTATTTAATATATTAATTTATTTTAT 9021 
/transl_table=5
/product="cytochrome c oxidase subunit III"
/protein_id="AAC47816.1"
/db_xref="GI:1166535"
/translation="MSTHSNHPFHLVDrSPWPLTGAIGAMTTVSGMVKWFHOyDISLP
VLGNIITILTVYQWWRDVSREGTYQGLHTYAVTIGLRWGMILFIIiSEVLFFVSFFWAF
FHSSLSPAIELGASWPPMGIISFNPFQIPLLNTAILLASGVTVTWAHHSLMENNHSQT
IQGLFFTVLLGIYFTILQAYEYIEAPFTIADSIYGSTFFMATGFHGIHVLIG1TFLLV
CLLRHLNNHFSKNHHFGFEAAAWYWHFVDWWLFLYITIYWWGG"
5543..5607
/gene="mt:ND6"



Query Match 6.4%; Score 193.8; DB 58; Length 19517;
Best Local Similarity 45.3%; Pred. No. 1.3e-12;
Matches 1291; Conservative 0; Mismatches 1498; Indels 59; Gaps 14;



Qy 87 ttctattttgctttcctctaggcttggcaatcgagaattttcttgtgttacaatataata 146 

II Mill II I II II II I I I II II I II 
Db 18579 TTATATTTCATTATAAAATTTATTTATTAAAAATTTTTTGTTTATTTTTTAAAAAACATG 18520 

Qy 147 aatacatcgtagaaataaattttattcaaattgaagtcttaaccatctttaatatttgta 206 

1 1 ii ii inn iiiiii mi i mi mini i 

Db 18519 ATTTTATTATATAAATATTTTTTATAAAAATAATACATTTAAGAAATTTTTAAA A 18465 

Qy 207 gatgtaatttaaatgaaagataaatacatattcttggacatgtattttcatctta-atgt 265 

II II IIIIII I I I I II! II I II III I II III II 

Db 18464 AATTTATATTAAATTATTTAAATAATTTAATTTTTCTATATATATATATATATTATATAA 18405 

Qy 266 ttgtggctttggtgataggtgtattgatgtacgatgtcttttaaatcacatatcacattt 325 

III III I III II IMI II II III llll 

Db 18404 AIATTCAATAATATATAAATTTATAAATATATAATAATTAATTAAATCATTATATTATTT 18345 

Qy 326 -"tgagtttgtatgatgataagtcgacataancgaaatatggtgtgatcttcacttttg 382 

llll III llll I I II I llll llll 

Db 18344 ATATAAATTAAATTAATAATAAATAAATATGAGAATATAAATTTTTATAAATTATATCTA 18285 

Qy 383 aactttgataagtcaccaaactttaacaaagtttgattgtgtacatatatatatatatct 442 

I III II II I llll II II I II I IIIIII I 
Db 18284 CATTTTTAAATTTTAAAATTTTTTATTTAAATTATTAGATATATAATAATATATTAAATA 18225 

Qy 443 tcaaattttataataaaaattgtgtttaaataatttacagttatattatttttttatctc 502 

I I I llll II I I I I IIIIII I II II lllll I 
Db 18224 TTTATATATATATAAATATCTATTAATTTATAATTAGTATATAGTTTTTTTTTAAAAAAA 18165 

Qy 503 taattttatttgtcgccaaatttttagttgatattttaacataaaaaaaattgtacacat 562 

llll I III I llllllll I III II lllll III I III 
Db 18164 AAATTATTTTTTTTAAAAAATTTTTTAAAAAAATTGAAAAATAAATAAATTAIATTTCAT 18105 

Qy 563 ttacaagcccatatacaaataattatataaatattcattaaaaaata tatt 613 

I II II II II llll I I llll llllllll I III 

Db 18104 TATAAAATTTATTTATTAAAAATTTTTTGTTTATTTTTTAAAAAACATGATTTTATTATA 18045 

Qy 614 taaatataggatataaatataa-ctattttagaattattctactttaagataacataggt 672 

inn iiiiii iiii iiii iiii iiii iii i i 

Db 18044 TAAATATTTTTTATAAAAATAATACATTTAAGAAATTTTTAAAAAATTTATATTAAATTA 17985 

Qy 673 taaatgtataattaataaggttagtttattgtaaagatgagtatatatgttgtaaacata 732 

II II I I I lllll llll llll III 

Db 17984 TITAAATAATTIAATTTTTCTATATATAIATATATATTATATAAATATTCAATAATATAT 17925 

Qy 733 atcactaaccatttttattaacttcttggttttgaagttccaaaaagaaaatggaaggga 792 

I II II I II III I II II I I I llll II I 
Db 17924 AAATTTATAAAIATATAATAATTAATTAAATIATTATATTATTTATATAAATTAAATTAA 17865 

Qy 793 aatttgagagtaagttcatgtttatattatacataatgaagttgatgttttcttcttttt 852 

lllll I I I II II I I I I I llll III 
Db 17864 TAATAAATAAATATGAGAATATAAATTTTTATAAATTATATCTACATTTITAAATTTTAA 17805 

Qy 853 aatatttttatacaaaatatttaaataaaataattaaggattgaatgaaaaatataatga 912 

III llll III II III IIIIII II llll lllll I 

Db 17804 AATTTTTTATTTAAATTATTAGATATATAATAATATATTAAATATTTATATATATATAAA 17745 
Qy 913 aagtcgttttactaatagtcatattgcattttgtcgcatctacttaaataatagat---a 969



I II II I I III llll I I I II II I I I 

Db 17744 TATCTATTAATTTATAATTAGTATATAGTTTTTTTTIAAAAAAAAAATTATTTITTTTAA 17685 

Qy 970 aattaattgtggtacattagatcaaagaacaaactagattttgtcccattctattgttaa 1029 

II II I III II II II III II MM I I II III 

Db 17684 AAAATTTTTTTTTAAAAATGAAAAATAAATAAATTATATTTCATTATAAAATTTATTTAT 17625 

Qy 1030 aagctggtccgtttacattaaaataaggtacatgttacatgccacgtataactatctggt 1089 

I I I I III II II II I III I I 

Db 17624 TAAAAATTTTTTGTTTATTTTTTAAAAAACATGATTTTATTATATAAATATTTTTTATAA 17565 

Qy 1090 tattctatcaatcacgctaatttttaacagtagaaatgaatgtaatttttaaatagaaag 1149 

II lllll I llll II I I I II II II II III II 

Db 17564 AAATAATACATTTAAGAAATTTTTAAAAAATTTATATTAAATTATTTAAA1AATTTAATI 17505 

Qy 1150 ggtcaaattgttatttgatctaacacgtagggattaatttacttattt tcctaaa 1204 

lllll I II I I I II I llll II llll I II 

Db 17504 TTTCTATATATATATATATATTATAAATATTCAATAATATATAAATTTATAAATATATAA 17445 

Qy 1205 gaaataagtaaaatataatttgaatcttaatacaaaaactttcatgatacttttatcata 1264 

II III llll III I II II I lllll II II II I I 

Db 17444 TAATTAATTAAATTATTAAAAAAAAAAAAAAAAAAAAATAATTTTITATTATTAATTAAA 17385 

Qy 1265 ttttacttataatttaatattgtgagagtaacaaarttaaaaaacatagaaacaccaaaa 1324 

I III I llll I III II I I II I I III II I I I 

Db 17384 TATTAGTAATAAATAAATTTTATTTAATTATTAGTTATTAAATTTATTTATTATTAATTA 17325 

Qy 1325 gttagttatggtgtgactcatatacacagttaaaatttgaataaatttttttcttcgtca 1384 

II I II II III I II II I I III I I III II I I 

Db 17324 AATATTAATAATG AATAGATTTTATTTAATTATTAATTAITAAATTTATTTATTA 17270 

Qy 1385 ttaattccatcatgggtttttttttttctagttaagccataattatcaaaataatcatca 1444 

llll I llll llll I I III III I lllll I I 

Db 17269 TTAAAATTTAAAAATTATTTTCATTTTAATATATATATATATATATATATATAATTTTTA 17210 

Qy 1445 ttaatcctatcaataccccgccctgcctccctccctcaatacttaaaccca-actaaca 1502 

lllll I I II I I llll III I III ■ 

Db 17209 TTAATTATTTTAAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGT 17150 

Qy 1503 cccagcaccaaacgcactttaatagccacctatttctagccatgtccttgcacttaaaga 1562 

I II I III I I I I I I II I I I I - 

Db 17149 ATAATTTAATAAATAATTTTTTTTTAAAAAAAAATATTTTTTTAAGTTTTAATTATATAA 17090 

Qy 1563 aaagtaaagctaacctgcaatcattccatatcgaggcctcaacagataaagttggttgat 1622 

II I I I III I I I I III I I I 

Db 17089 TAAATTTATGAATAGGGGGAATAAATTTATTTTCATTTTACATATATATATATATATATA 17030 

Qy 1623 gggtttgcaccaagttgttaaaacccggccctcaacttcccttttcttttcatcctcccc 1682 

I I II I II I I I I III II I II I II II , 

Db 17029 TATATACAATTAATTAATTCAGATTTAGTGATTAAAATAAATTATTTTATIAIACT — 16972 

Qy 1683 actccacaccctccaattttcttcatatggttctattataagttctttataatcacagaa 1742 

Ml I II I I I III II llll I III 

Db 16973 - • TATATAATTTAATTGAAAATTAAATTAT ATGTAT ATAT ATAT AAATAT ATGAATTGAA 16916 

Qy 1743 tcaagataagtcctcagcaaacaaaaaaccatggctctcgagcaagatctggactagtca 1802 

I llll lllll II I I I II I I 

Db 16915 TTTTTATAAAAAATCATTTTAAATTTTTATTATATTAAAAATATTTTIA1TAATTATTTA 16856 

Qy 1803 gagctctgaatattggatcattattacagtcaaaaacagttaacaaaagctgttgcagat 1862 

II II I II III lllll llll I II III 

Db 16855 AAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGTATAATTIAATA 16796 

Qy 1863 aaacactgaatctgctatag tttgtttttggtttacatatgttccacgtgaaact 1917 

II II I I I I I IIIIII llll III III 

Db 16795 AATCATTTTTTTTTAAAAAAAAAATATTTTTTAAGTTTTAATTATACAATAAATTTATGA 16736 

Qy 1918 atgaagcatctctaagaaaacccaaactatcatatcaacccatcgatcaatgaatcgatt 1977 

II I III llll llll I II II II I I I 

Db 16735 AIAGGGGGAATAAATTTATTTTCAITTTITTATATATAIATATATATATATAATTAATTA 16676 

Qy 1978 tcaattttcgcagtataagttccttttaatcctttctttttacttcattttataacgaat 2037 

I I II I III I II I llll I llll II III 
Qy 1445 ttaatcctatcaataccccgccctgcctccctccctcaatacttaaaccca--actaaca 1502

III!! I I II I I I Ml Ml I III 

Db 2293 TTAATTATTTTAAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGT 2234 

Qy 1503 cccagcaccaaacgcactttaatagccacctatttctagccatgtccttgcacttaaaga 1562 

I II I III I I I I I I II I I I I 
Db 2233 ATAATTTAATAAATAATTTTTTTTTAAAAAAAAATATTTTTTTAAGTTTTAATTATATAA 2174 

Qy 1563 aaagtaaagctaacctgcaatcattccatatcgaggcctcaacagataaagttggttgat 1622 

II I I I III I I I I III I I I 
Db 2173 TAAATTTATGAATAGGGGGAATAAATTTATTTTCATTTTACATATA1ATATATATATATA 2114 

Qy 1623 gggtttgcaccaagttgttaaaacccggccctcaacttcccttttcttttcatcctcccc 1682 

I I II I II I I I I II I II I II I II II 

Db 2113 TATATACAATTAATTAATTCAGATTTAGTGATTAAAATAAATTATTTTATTATACT — 2056 • 

Qy 1683 actccacaccctccaattttcttcatatggttctattataagttctttataatcacagaa 1742 

I I I I II I I I III II Mil I III 

Db 2057 "TATATAATTTAATTGAAAATTAAATTATATGTATATATATATAAATATATGAATTGAA 2000 

Qy 1743 tcaagataagtcctcagcaaacaaaaaaccatggctctcgagcaagatctggactagtca 1802 

I llll III I I II I I I II I I 

Db 1999 TTTTTATAAAAAATCATTTTAAATTTTTATTATATTAAAAATATTTTTATTAATTATTTA 1940 

Qy 1803 gagctctgaatattggatcattattacagtcaaaaacagttaacaaaagctgttgcagat 1862 

II II I II III III llll I II III 

Db 1939 AAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGTATAATTTAATA 1880 

Qy 1863 aaacactgaatctgctatag tttgtttttggtttacatatgttccacgtgaaact 1917 

II II I I I I I II III I llll III III 

Db 1879 AATCATTTTTTTTTAAAAAAAAAATATTTTTTAAGTTTTAATTATACAATAAATTTATGA 1820 

Qy 1918 atgaagcatctctaagaaaacccaaactatcatatcaacccatcgatcaatgaatcgatt 1977 

III III llll Mil I II II II I I I 

Db 1819 ATAGGGGGAATAAATTTATTTTCATTTTTTTATATATATATATATATATATAATTAATTA 1760 

Qy 1978 tcaattttcgcagtataagttccttttaatcctttctttttacttcattttataacgaat 2037 

I I II I III I III llll I llll II III 
Db 1759 TTTCAGATTTAGTGATTAAAATAAATTATTTTATTATATTTATATAATTTAATTGAAAAT 1700 

Qy 2038 tctatggataatgttccctacaaacatgtcattacaatgtttaattataaattccattct 2097 

I I III I II I MM I II II I Ml I II II 
Db 1699 TAAAATTATATATAT * - ATATATATATATAAATATAAATTGAATTTTTTAAAAATTATTT 1642 

Qy 2098 tctattttactaagatattagtaacttcaaactgctgatttttactaatttattatttat 2157 

I lllll III I llll II I MM I llllll I I II 

Db 1641 TTAATTTTTATTATAATAAAAATATTTCTTATTAATTATTTTAAATAATTTTATTTATAA 1582 

Qy 2158 aaattgttagaatgattatttttcaataatttaacaacaatatttaatattattattatt 2217 

II III II I II III III I I llll llll II II III 

Db 1581 AATAATTTATTATAAAAATAGTTTATTAAGTATAATTTAATAAATAATTTTTTTTTTAAA 1522 

Qy 2218 attatttctcaatttttattaaacaaaaacataaatttttgacaaattaaaataaatgaa 2277 

I II I II llll I I llllll! Ill I lllllll I 

Db 1521 AAAAATATTTTTTTAAGTTTTAATTATATAATAAATTTATGAATAGGGGGAATAAATTTA 1462 

Qy 2278 ttaatttctcaatttttcgtgcaactat tacaaaaatccttcatagtccta 2328 

II II I II I I III I III llll I I 

Db 1461 TTTTCATTTTACATATATATATATATATATATATATACAATTAATTAATTCAGATTTAGT 1402 

Qy 2329 atcttaatttgatgcagaggtgataataatcttaatttgatgcagaggtaataatgggcc 2388 

III I II I III llll I II II II III II I 
Db 1401 GATTAAAATAAATTATTTTATTATACTTATATAATTTAAITGAAAATTAAATTATATGTA 1342 

Qy 2389 gggtttgagctggacttaagcatgatattgacgtactttatatttttccaaattcaaccc 2448 

II III II III I I III II II I 

Db 1341 TATATATATAAATATATGAATTGAATTTTTATAAA- : -AAATCATTTTAAATTTTTATTA 1285 
Qy 2449 agctcgaaatatgagtctaaaattttgtccaatttaatccaagcccattttaagttcgtc 2508 

i mm i mm ii i i i ii ii 

Db 1284 TATTAAAAATATTTTTATTAATTATTTAAAATAATTTTATTTATAAAATAATTTATTATA 1225 
Qy 2509 catattattttttaatttaaaaaatttatatcattttattttaatatttaattattttat 2568 



i n iii mi i ii ii mini! inin i n mm i 

Db 1224 AAAATAGTTTATTAAGTATAATTTAATAAATCATTTTTTTTIAAAAAAAAAATATTTTTT 1165 
Qy 2569 atattttttatttattgaaaatttttatatagtcatcttaacattatgttaatgtttata 2628 

i mi i ii iiiiii i nn iii iiii ii ii in ii 

Db 1164 AAGTTTTAATTATACAATAAATTTATGAATAGGGGGAATAAATTTATTTTCATTTTTTTA 1105 

Qy 2629 ttagagtagtattatatatatttagtataggtttattttgttaataaacttaaaaatggg 2688 

I II HUM III I llll II I llll I II II 
Db 1104 TATATATATATATATATATAATTAATTATTTCAGATTTAGTGATTAAAATAAATTATTTT 1045 

Qy 2689 tcttgtgggctagacttggaccttaaatgctcaaactcaaacttaattcatattttaaac 2748 

I Ml II I llll I I I I II MM I II 
Db 1044 ATTATATTTATATAATTTAATTGAAAATTAAAATTATATATATATATATATATATAAATA 985 

Qy 2749 aggcttaatatttttatttacactgtttcaaatttttcgggtgaaatatcttcgagtcta 2808 

II I llllll I I III I III I Ml II II II 
Db 984 TAAATTGAATTTTTTAAAAATIAITTITAATITTTATTATAATAAAAATATTTCTTATTA 925 

Qy 2809 gattaataacaccacaggtctaatttgatgctcaatgaaaatgaaatcatattgagctta 2868 

I llll III II I III II I I llll II II 
Db 924 ATTATTTTAAATAATTTTATTTATAAAATAATTTATTATAAAAATAGT1TATTAAGTATA 865 

Qy 2869 attaatattccattcttctttgctgaaa 2896 

III I II II III III 
Db 864 ATTTAATAAATAATTTTTTTTTAAAAAA 837 



RESULT 3
DMU37541/C
LOCUS       DMU37541 19517 bp DNA circular INV 04-APR-2000
DEFINITION  Drosophila melanogaster complete mitochondrial genome.
ACCESSION   U37541
VERSION     U37541.1 GI:1166529
KEYWORDS
SOURCE      Drosophila melanogaster.
  ORGANISM  Mitochondrion Drosophila melanogaster
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1 (bases 12511 to 12682)
  AUTHORS   Clary,D.O., Goddard,J.M., Martin,S.C., Fauron,C.M. and
            Wolstenholme,D.R.
  TITLE     Drosophila mitochondrial DNA: a novel gene order
  JOURNAL   Nucleic Acids Res. 10 (21), 6619-6637 (1982)
  MEDLINE   83090428
REFERENCE   2 (bases 5269 to 5695)
  AUTHORS   Clary,D.O., Wahleithner,J.A. and Wolstenholme,D.R.
  TITLE     Transfer RNA genes in Drosophila mitochondrial DNA: related 5'
            flanking sequences and comparisons to mammalian mitochondrial tRNA
            genes
  JOURNAL   Nucleic Acids Res. 11 (8), 2411-2425 (1983)
  MEDLINE   83220794
REFERENCE   3 (bases 404 to 5272)
  AUTHORS   de Bruijn,M.H.
  TITLE     Drosophila melanogaster mitochondrial DNA, a novel organization and
            genetic code
  JOURNAL   Nature 304 (5923), 234-241 (1983)
  MEDLINE   83245048
REFERENCE   4 (bases 804 to 1778)
  AUTHORS   Satta,Y., Ishiwa,H. and Chigusa,S.I.
  TITLE     Analysis of nucleotide substitutions of mitochondrial DNAs in
            Drosophila melanogaster and its sibling species
  JOURNAL   Mol. Biol. Evol. 4 (6), 638-650 (1987)
  MEDLINE   88174373
REFERENCE   5 (bases 5268 to 13619)
  AUTHORS   Garesse,R.
  TITLE     Drosophila melanogaster mitochondrial DNA: gene organization and
            evolutionary considerations
  JOURNAL   Genetics 118 (4), 649-663 (1988)
  MEDLINE   88212147
REFERENCE   6 (bases 441 to 2967)
  AUTHORS   Satta,Y. and Takahata,N.
25  134.8  4.4  130117  39  AC004907   AC004907 Homo sapi
26  133.8  4.4  158116  60  AC006281   AC006281 Plasmodiu
27  133.2  4.4    2426   8  SDD49822   U49822 Saccharomyc
28  133    4.4  178087  39  AC005089   AC005089 Homo sapi
29  132.2  4.3  106650  39  AC007708   AC007708 Homo sapi
30  132.2  4.3  176552  39  AC004617   AC004617 Homo sapi
31  132.2  4.3  203519  40  CNS01RHQ   AL162191 Homo sapi
32  131.8  4.3  321003  31  PFMAL4P3   AL035476 Plasmodiu
33  131.2  4.3   80518  31  PFMAL13PA  AL109815 Plasmodiu
34  130.6  4.3  170125  41  AC007465   AC007465 Homo sapi
35  130    4.3  321003  31  PFMAL4P3   AL035476 Plasmodiu
36  129.6  4.3  282806  60  AC006279   AC006279 Plasmodiu
37  129.4  4.2  158116  60  AC006281   AC006281 Plasmodiu
38  129.2  4.2  224448  31  PFMAL4P4   AL035477 Plasmodiu
39  129    4.2  318221  31  PFMAL13P3  AL049184 Plasmodiu
40  128.8  4.2  106650  39  AC007708   AC007708 Homo sapi
41  128.8  4.2  176552  39  AC004617   AC004617 Homo sapi
42  128.6  4.2   12029  34  AE001372   AE001372 Plasmodiu
43  127.6  4.2   75076  39  AC004948   AC004948 Homo sapi
44  127.6  4.2  204652  31  PFMAL13P6  AL049183 Plasmodiu
45  127.2  4.2   12029  34  AE001373   AE001373 Plasmodiu



RESULT 1
S79308
LOCUS       S79308 913 bp mRNA PLN 30-NOV-1995
DEFINITION  Rac13=21.8 kda GTP-binding protein [Gossypium hirsutum=cotton
            plants, cv. Acala SJ-2, boll fibers, mRNA Partial, 913 nt].
ACCESSION   S79308
VERSION     S79308.1 GI:1087110
KEYWORDS
SOURCE      upland cotton boll fibers cv. Acala SJ-2.
  ORGANISM  Gossypium hirsutum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core
            eudicots; Rosidae; eurosids II; Malvales; Malvaceae; Gossypium.
REFERENCE   1 (bases 1 to 913)
  AUTHORS   Delmer,D.P., Pear,J.R., Andrawis,A. and Stalker,D.M.
  TITLE     Genes encoding small GTP-binding proteins analogous to mammalian
            rac are preferentially expressed in developing cotton fibers
  JOURNAL   Mol. Gen. Genet. 248 (1), 43-51 (1995)
  MEDLINE   95379748
  REMARK    GenBank staff at the National Library of Medicine created this
            entry [NCBI gibbsq 170155] from the original journal article.
            This sequence comes from Fig. 1A.
FEATURES    Location/Qualifiers
source 1..913
/organism="Gossypium hirsutum"
/db_xref="taxon:3635"
gene 12..602
/gene="Rac13"
/note="21.8 kda GTP-binding protein"
CDS 12..602
/gene="Rac13"
/note="21.8 kda GTP-binding protein; pea Rho1 protein
homolog/mammalian rac protein homolog; ; This sequence
comes from Fig. 1A"
/codon_start=1
/protein_id="AAB35093.1"
/db_xref="GI:1087111"
/translation="MSTARFIKCVTVGDGAVGKTCMLISYTSNTFPTDYVPTVFDNFS
ANVWDGSTVNLGLKDTAGQEDYNRLRPLSYRGADVFLLAFSLISKASYENIYKKWIP
ELRHYAHNVPWLVGTRLDLRDDKQFLIDHPGATPISTSQGEELKKMIGAVTYIECSS
KTQQNVKAVFDAAIKVALRPPRPKRKPCKRRTCAFL"
BASE COUNT  307 a 169 c 172 g 265 t
ORIGIN



Query Match 10.0%; Score 304.8; DB 8; Length 913;
Best Local Similarity 97.8%; Pred. No. 8.5e-24;
Matches 309; Conservative 0; Mismatches 7; Indels
Gaps



Qy       1806 ctctgaatattggatcattattacagtcaaaaacagttaacaaaagctgttgcagataaa 1865
              || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        597 CTTTGAATATTGGATCATTATTACAGTCAAAAACAGTTAACAAAAGCTGTTGCAGATAAA 656

Qy       1866 cactgaatctgctatagtttgtttttggtttacatatgttccacgtgaaactatgaagca 1925
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        657 CACTGAATCTGCTATAGTTTGTTTTTGGTTTACATATGTTCCACGTGAAACTATGAAGCA 716

Qy       1926 tctctaagaaaacccaaactatcatatcaacccatcgatcaatgaatcgatttcaatttt 1985
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        717 TCTCTAAGAAAACCCAAACTATCATATCAACCCATCGATCAATGAATCGATTTCAATTTT 776

Qy       1986 cgcagtataagttccttttaatcctttctttttacttcattttataacgaattctatgga 2045
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        777 CGCAGTATAAGTTCCTTTTAATCCTTTCTTTTTACTTCATTTTATAACGAATTCTATGGA 836

Qy       2046 taatgttccctacaaacatgtcattacaatgtttaattataaattccattcttctatttt 2105
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        837 TAATGTTCCCTACAAACATGTCATTACAATGTTTAATTATAAATTCCATTCTTCTATTTT 896

Qy       2106 actaagatattagtaa 2121
              ||||| | |  |  ||
Db        897 ACTAAAAAAAAAAAAA 912



RESULT 2
DMU11584/C
LOCUS       DMU11584 4601 bp DNA INV 23-JUL-1994
DEFINITION  Drosophila melanogaster Oregon-R mitochondrial A+T region.
ACCESSION   U11584
VERSION     U11584.1 GI:508826
KEYWORDS    mitochondrial DNA; A+T region; tandem repeats.
SOURCE      fruit fly.
  ORGANISM  Mitochondrion Drosophila melanogaster
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1 (bases 1 to 4601)
  AUTHORS   Lewis,D.L., Farr,C.L., Farquhar,A.L. and Kaguni,L.S.
  TITLE     Sequence, Organization and Evolution of the A+T Region of
            Drosophila melanogaster Mitochondrial DNA
  JOURNAL   Mol. Biol. Evol. 11, 523-538 (1994)
  MEDLINE   94285822
REFERENCE   2 (bases 1 to 4601)
  AUTHORS   Kaguni,L.S.
  TITLE     Direct Submission
  JOURNAL   Submitted (28-JUN-1994) Laurie S. Kaguni Ph.D, Dept. of
            Biochemistry, Michigan State University, East Lansing, MI,
            48824-1318, USA
FEATURES    Location/Qualifiers
source 1..4601
/organism="Drosophila melanogaster"
/organelle="mitochondrion"
/strain="Oregon-R"
/db_xref="taxon:7227"
/dev_stage="embryo"
gene 1..4601
/gene="mt:ori"
/note="mitochondrial origin"
/allele=""
/db_xref="FlyBase:FBgn0013687"
repeat_unit 650..1022
/gene="mt:ori"
/note="repeat I-A"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
repeat_unit 1023..1360
/gene="mt:ori"
/note="repeat I-B1"
/db_xref="FlyBase:FBgn0013687"
/rpt_type=tandem
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Cancer Genetics at the Roswell Park Cancer Institute in Buffalo, 
NY. The library is named RPCI-98 and was constructed by partial 
EcoRI digestion of Drosophila DNA provided by the BDGP from the 
isogenic strain y2; cn bw sp, the same strain used for the BDGP's 
P1 and EST libraries. A more detailed description of the library
and how to order individual BAC clones, the entire library, or
filters for hybridization from the BACPAC Resource Center can be
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES Location/Qualifiers
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BJOOSHll"
/note="end : TET3"
BASE COUNT 631 a 7 c 28 g 289 t 146 others



Query Match 2.0%; Score 111.8; DB 122; Length 1101; 

Best Local Similarity 41.8%; Pred. No. 8.6e-08;

Matches 270; Conservative 63; Mismatches 312; Indels 1; Gaps 

Qy 4759 ttttaaatgttgtatctaatgttaacatcacttggcttgatttatgttatgttatgtatt 4818 

I II :! II I I I I II III:: I I I I Ml 

Db 1065 TMTTYTMTMTTCTCTTTTTTTTTTTTTTTTTTTTTMWTTTNMMAAATATTHMCACTTTTT 1006 

Qy 4819 ttactttaatgatattgcatgtattgttaatttaacattgcttgatcattatactcttct 4878 

: :| : II II hi : II III: I : : I III r I :| 
Db 1005 CATMCYTTCWTATTTTTCMTMHTTTTTTAMMATTMAMAMMTTATYMTCTTACHATTYTTA 946 

Qy 4879 actattaattataaatggcactgttttgtttaaactttttacaagttaagacatgtataa 4938 

II :::ll :: :l IIIIMII II I : Ml: II 

Db 945 ACMYCMMMYTAMCMMMMCATTCMCAWTTTTTAWACTTAAAAAACATAWTTAAATSATTAT 886 

Qy 4939 atatatgacaatataattacaggttttagttcaatgttagctatcttagtatgttattga 4998 

I II I IMI: I III I I:: l|: I I II III II II 
Db 885 TAAAATTTTTAAATAAWAAATAATTTAAAAAAAWWTTTWTTTTTTTTTTTATTTTTTTAT 826 

Qy 4999 tgatcttaattacatttaaacaaattccacttaaaattttaataaataataacaaataat 5058 

Db 825 TTOTTTTTTTTTTHTTTTTAATTAAAAAAM^ 766 

Qy 5059 tattgtaatataatacattaaatgcaacaaaaaatgaaataaataaaataaaatagcaaa 5118 

III: III II : : :|| III: II : I llhllll I II 
Db 765 TATWTAAmTTTAPAraTWTTTTAAAAWTAATAAAWTTTTAAAWAAAAAAAWTAA 706 

Qy 5119 taattgttataatattgtaatataatatgtaccatattcttaactgaaatagggtctaac 5178 

II : II III III Ml MM II : |:| :|| I |: I 
Db 705 AAAAWTTTTTAAAATT-TWATTTWATWWATAAAWAAWTTWTATWTTTTTTWTATTAATTA 647 

Qy 5179 ctataatccctaaaatttcagtttaaatatttttatacctaccatattattagaactctt 5238 

:| II :|: Ml I :| I ||::| II I : |: : : I I 
Db 646 AWAAAAAAAAWAOTWTTTAAAAKTTTTTTTTMWTTAWATTAAAAWTAAAWAWTWTTTTAT 587 

Qy 5239 tttaaatatattaaaattttaattataccaatttaattaaactattaattatcttaacta 5298 

I I MUM II II I I II II III I M II :! I 
Db 586 TAAATTTWTWTTAAAMTTAAAATTAAWAAAAAAAAAAAAAAAAAAAWTAATTAAWAAAA 527 

Qy 5299 aaatctaaaattttatttaacctattaataaattcctaattatcttatctaatttaaaac 5358 

Db 526 AAOTAAATTATATTTTT 467 

Qy 5359 tctaattatcctaatttaatttaaattcttaattatcttaatttgt 5404 

I I II I I III III II II II I II III I 
Db 466 HTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT 421 



Search completed: September 2, 2000, 22:57:50 
Job time: 19286 sec 



Tue Sep 5 07:23:03 2000 



us-08-984-099-7.rst 



Page 10 



REFERENCE 1 (bases 1 to 1201)
AUTHORS Genoscope.
TITLE Direct Submission
JOURNAL Submitted (23-JUL-1999) Genoscope - Centre National de Sequencage :
BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
- Web : www.genoscope.cns.fr)
COMMENT Determination of this BAC-end sequence was carried out as part of a
collaboration with the European Drosophila Genome Project (EDGP) -
http://www.edgp.ebi.ac.uk -. This Drosophila melanogaster BAC
library (Dros BAC) was made by Alain Billaud at CEPH (Centre
d'Etude du Polymorphisme Humain) with funding provided by a MRC
project grant. The DNA was prepared from embryos by Alain Bucheton
and Genevieve Payan. It has been constructed in the vector
pBeloBAC11.
FEATURES Location/Qualifiers
source 1..1201
/organism="Drosophila melanogaster"
/plasmid="pBeloBAC11"
/db_xref="taxon:7227"
/clone_lib="DrosBAC"
/clone="BACN15M24"
/note="end : T7"
BASE COUNT 323 a 87 c 79 g 551 t 161 others
ORIGIN



Query Match 2.1%; Score 115; DB 123; Length 1201; 

Best Local Similarity 38.9%; Pred. No. 2.8e-08;

Matches 249; Conservative 90; Mismatches 298; Indels 3; Gaps 

Qy 4685 gtgaatctctttgcatacatagaaattctaaatggttatagtttatgttatagtgtatgt 4744 

III : I : :|:| :| :: :| ::::: II I III I I I :|:| I : 
Db 564 GTGTWATTKKMMVCWTSTMTNHHMTHTTWMMMHNMMTAGATTTTKTTTATTTKTKTTTTK 623 

Qy 4745 tgtagtgaaattaattttaaatgttgtatctaatgttaacatcacttggcttgatttatg 4804 

III I Mil : : : I : |::| I :::: |:: ::| 
Db 624 TTTGTTTTTTTANGCTTTARKKTKGTKTTTRBTTKRTKTTTTGKKKKRKKTRKTKKTTGT 683 

Qy 4805 ttatgttatgtattttactttaatgatattgcatgtattgttaatttaacattgcttgat 4864 

: I II I llll III I I II II I ::|llh III II II 
Db 684 DAAAAAAATNTTTTTTTTTTTTTTTTTTTTTTTTTTTTATDWAATTWTTTTTTGTTTTAT 743 

Qy 4865 cattatactcttctactattaattataaatggcactgttttgtttaaactttttacaagt 4924 

111:1 II:: I I ::: I |:|| llll I Ml 
Db 744 GNATTTAAWTTATATTTAWWATTTTTTWWWTW- ■ •TTTWTTTTTTATTTTNAAAAARAAT 800 

Qy 4925 taagacatgtataaatatatgacaatataattacaggttttagttcaatgttagctatct 4984 

Db 801 TAAATTTTTTTTTTTTWAAAMTWTTATAAW 860 

Qy 4985 tagtatgttattgatgatcttaattacatttaaacaaattccacttaaaattttaataaa 5044 

: I II I II I :l II I :l II : 1 1 1 i : I : I : I I 
Db 861 WTTTTATTTTATAATTTTTAWATTTTTAAWTTTTTTTTTTAAWRAAAAAAWTWTWTTTAT 920 

Qy 5045 taataacaaataattattgtaatataatacattaaatgcaacaaaaaatgaaataaataa 5104 

Hill III Ml : Ml IM :: HIM llll I 
Db 921 AWATAAAAAAAAWTTTWATATTWTTTTTTTTTTWAAAAAWWATWAAAAATTTTTAAAAAT 980 

Qy 5105 aataaaatagcaaataattgttataatattgtaatataatatgtaccatattcttaactg 5164 

hi I : II II I I I I II I : I :M llll II 
Db 981 TTTWATAATTAWTTTATTTTTAAAATTTTTTTTTWTTTWTWTATAAAAAAAAAAAAAAAA 1040 

Qy 5165 aaatagggtctaacctataatccctaaaatttcagtttaaatatttttatacctaccata 5224 

:| I: :M I II :|l|::: III I 'Mil: || =11 
Db 1041 WATTWTATANWAAAWAAATATTNTADAAAWWWTTTTTTTTTTTTTTTTWNTTATATTWTA 1100 

Qy 5225 ttattagaactctttttaaatatattaaaattttaattataccaatttaattaaactatt 5284 

II : : llhi : I h: Milt: | I III : I llll 
Db 1101 TTTAATATWWWITTWTTWTTTATATWWTATTTTWTTTTTTATANTTTTTNWTATATATT 1160 

Qy 5285 aattatcttaactaaaatctaaaattttatttaacctatt 5324 

: :||| I I I II II I hhl |: |:|: 
Db 1161 WTWTATTTATTTTTATATTTATATATWTWTATWTWATWTW 1200 



RESULT 13 
CNS003BD/C 
LOCUS CNS003BD 1101 bp DNA GSS 03-JUN-1999
DEFINITION Drosophila melanogaster genome survey sequence TET3 end of BAC #
BACR08K08 of RPCI-98 library from Drosophila melanogaster (fruit
fly), genomic survey sequence.
ACCESSION AL064091
VERSION AL064091.1 GI:4941847
KEYWORDS GSS.
SOURCE fruit fly.
ORGANISM Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE 1 (bases 1 to 1101)
AUTHORS Genoscope.
TITLE Direct Submission
JOURNAL Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage :
BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
- Web : www.genoscope.cns.fr)
COMMENT Determination of this BAC-end sequence was carried out as part of a
collaboration with the Berkeley Drosophila Genome Project (BDGP).
The BDGP is constructing a physical map of the Drosophila
melanogaster genome using these BACs. For further information
please see http://www.fruitfly.org The BDGP Drosophila
melanogaster BAC library was prepared by Kazutoyo Osoegawa and
Aaron Mammoser in Pieter de Jong's laboratory in the Department of
Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
NY. The library is named RPCI-98 and was constructed by partial
EcoRI digestion of Drosophila DNA provided by the BDGP from the
isogenic strain y2; cn bw sp, the same strain used for the BDGP's
P1 and EST libraries. A more detailed description of the library
and how to order individual BAC clones, the entire library, or
filters for hybridization from the BACPAC Resource Center can be
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES Location/Qualifiers
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BACR08K08"
/note="end : TET3"
BASE COUNT 395 a 120 c 103 g 334 t 149 others
ORIGIN



Query Match 2.1%; Score 114.2; DB 122; Length 1101;

Best Local Similarity 42.1%; Pred. No. 3.8e-08;

Matches 262; Conservative 73; Mismatches 281; Indels 7; Gaps 

Qy 3309 aaaatatagtaatataaagtgtaattaactttaaattacaagcataatattaaattttga 3368 

llll II: :: I :h ::: hi : II :l I II :| I:: 
Db 1101 AAAAAATRWGRWGNGGATAWGWTWKDTKWTWTTTWATAATNTADTTKTTTTTTKWTATRW 1042 

Qy 3369 atcaattaatttttatttctattattttaattaatttagtctattttttcaaaataaaat 3428 

II hi Ih hhl :!::!:!: III : II I llllll Mill 
Db 1041 ATAKAWTTTTTWGTRTWTTTRTWWTWTWNATTTWATTTTATTTTTTTTTTWWTATTATAA 982 

Qy 3429 ttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgttatacttcaaa 3488 

1:1 : II I III II Ml: |:::| h hi MM : h II I 
Db 981 •TWATWATATAAAAATATTWTWTWTAATWRWTTTWTARAKAAAWAAWTAWTTWATTTTTA 923 

Qy 3489 attataagtattatatttaccttgatgatttatttattagtatattaattctgattataa 3548 

M I llll :||l II : llll: ::|| :|:|| :lll I llll I 
Db 922 TWTTAATTTTTTTTTWTTAWTTTAWTTATTTWAAAWWTATWAWATAWATTATTTTTATTA 863 



Db 



3549 ttatggtgggatacaatcgctttccactaaatattttaactatgatttataaatttattt 3608 

Mil: I I : lllll III I I M: hi 
862 TTTTTTTWTTTTTATTWTANAATWTTTAATKAWATTTAAWTATWAAATTTAAAWAAAWTA 803 



Qy 3609 caacatcgtatatttacttattaatacataatttatcataattttatggaaattgagacc 3668 






please see http://www.fruitfly.org The BDGP Drosophila 
melanogaster BAC library was prepared by Kazutoyo Osoegawa and 
Aaron Mammoser in Pieter de Jong's laboratory in the Department of 
Cancer Genetics at the Roswell Park Cancer Institute in Buffalo, 
NY. The library is named RPCI-98 and was constructed by partial 
EcoRI digestion of Drosophila DNA provided by the BDGP from the
isogenic strain y2; cn bw sp, the same strain used for the BDGP's 
PI and EST libraries . A more detailed description of the library 
and how to order individual BAC clones, the entire library, or 
filters for hybridization from the BACPAC Resource Center can be 
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm. 
FEATURES Location/Qualifiers
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BACR29P01"
/note="end : TET3"
BASE COUNT 366 a 66 c 104 g 351 t 214 others
ORIGIN



Query Match 2.1%; Score 115.6; DB 122; Length 1101; 

Best Local Similarity 40.6%; Pred. No. 2.3e-08; 

Matches 231; Conservative 81; Mismatches 255; Indels 2; Gaps 1; 

Qy 4703 atagaaattctaaatggttatagtttatgttatagtgtatgttgtagtgaaattaatttt 4762 

I: ::| II II |::| HIM I I I :| I I II ::| III 
Db 969 AYATTTWWTATACATAATWWTTTTTTATATACAATTTWAAAATAAAAAWTAACWWAATTT 910 

Qy 4763 aaatgttgtatctaatgttaacatcacttggcttgatttatgttatgttatgtattttac 4822 

1:1 : I : I I :: II I I : :|| :: L : II :|| 
Db 909 AWAMWCATTWTTTTAAWWTAATTTWATTACAWTWTTAWWTAAAaAAATWTTAAAWTAA 850 

Qy 4823 tttaatgatattgcatgtattgttaatttaacattgcttgatcattatactcttctacta 4882 

: II ::lll I I : hi h I h: I II 111:: : I I II 
Db 849 AWAAAAARWATTAWAATTTAYATWATATWTAAAWWTATAWATTATTMW--WAATWTTATA 792 

Qy 4883 ttaattataaatggcactgttttgtttaaactttttacaagttaagacatgtataaatat 4942 

: I II I : I : :lh I: : I I III I : llll I I I III 
Db 791 MNATTTTTTTTWTAWAAWTWTTWATWAWTAATTTTAAWTWWTTAAATAAAAAAAATTTAT 732 

Qy 4943 atgacaatataattacaggttttagttcaatgttagctatcttagtatgttattgatgat 5002 

I hi llh I lllh II :l h: I I : II :hll I h 
Db 731 TTTTATTTWTTATTWAAAWTTTTWTTTTWAATTWWYTTTAATAWTTAAAWTWTTAAAWAW 672 

Qy 5003 cttaattacatttaaacaaattccacttaaaattttaataaataataacaaataattatt 5062 

I hill 111:11 I :hlhl I III hh: h: II II 
Db 671 WTAAWTTAAATAMTWTATTTAAAATWTWAAWTATAAATTTAWADWTTATAWWTTTTTTT 612 

Qy 5063 gtaatataatacattaaatgcaacaaaaaatgaaataaataaaataaaatagcaaataat 5122 

: : :|||| II Nil II I Mil : I III :::| II I :| ::: I 
Db 611 IfflCWTTWMTATATAAMTWAAAAATTAMTTWITTTTATATl^ATAAAAAMATWWWTT 552 

Qy 5123 tgttataatattgtaatataatatgtaccatattcttaactgaaatagggtctaacctat 5182 

llllll:: || | :|: II |||| I II I I I 

Db 551 ATTTATAAWWAATTATTTAWAWTTGATTTTTATTTATTTWTTTTTTAAAWTTTTATTATA 492 . 

Qy 5183 aatccctaaaatttcagtttaaatatttttatacctaccatattattagaactcttttta 5242 

: Hh I II I llllllll II h hllll I !::'!!: 
Db 491 TWATTTATTAATWTTAWTTATYTTTTTTTTATATCTTTCMAAWTATTTMTCCCCYYTTTW 432 

Qy 5243 aatatattaaaattttaattataccaatt 5271 

II II llll : I I I II 
Db 431 TTMATGTTTTTTTTTTTHCTCTTNCNTTT 403 
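
The statistics printed for the result above (Matches 231; Conservative 81; Mismatches 255; Indels 2; Best Local Similarity 40.6%) are consistent with the similarity figure being the number of matches divided by the total number of aligned columns. The following is a minimal sketch of that check, not the search program's own code: the function names `parse_counts` and `local_similarity` are hypothetical, and the formula is an assumption inferred from the printed numbers.

```python
import re


def parse_counts(stats_line):
    """Pull the named integer counts out of a 'Matches ...; Gaps ...;' line."""
    pairs = re.findall(
        r"(Matches|Conservative|Mismatches|Indels|Gaps)\s+(\d+)", stats_line
    )
    return {name: int(value) for name, value in pairs}


def local_similarity(counts):
    """Matches as a percentage of aligned columns (assumed definition)."""
    columns = (
        counts["Matches"]
        + counts["Conservative"]
        + counts["Mismatches"]
        + counts["Indels"]
    )
    return 100.0 * counts["Matches"] / columns


# Statistics line copied from the result above.
line = "Matches 231; Conservative 81; Mismatches 255; Indels 2; Gaps 1;"
counts = parse_counts(line)
print(round(local_similarity(counts), 1))  # 231 / 569 columns -> 40.6
```

The same relation reproduces other figures in this report, for example 280 matches over 749 columns giving 37.4%, which makes it a convenient way to cross-check count fields that the scan has garbled.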



RESULT 10 
B11102/C 

LOCUS B11102 1187 bp DNA GSS 14-MAY-1997
DEFINITION F19C22-T7 IGF Arabidopsis thaliana genomic clone F19C22,
genomic survey sequence.
ACCESSION B11102
VERSION B11102.1 GI:2092386
KEYWORDS GSS. 
SOURCE thale cress. 
ORGANISM Arabidopsis thaliana 

Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; 
Magnoliophyta; eudicotyledons; Rosidae; eurosids II; Brassicales; 
Brassicaceae; Arabidopsis. 
REFERENCE 1 (bases 1 to 1187) 
AUTHORS Feng,J., Dewar,K., Buehler,E., Kim,C., Li,Y., Shinn,P., Sun,H. and
Ecker,J.
TITLE BAC End Sequences at ATGC
JOURNAL Unpublished (1997)
COMMENT On Sep 10, 1998 this sequence version replaced gi:3556525.
Other_GSSs: F19C22-Sp6

Contact: Ecker J. 

Arabidopsis Thaliana Genome Center 
University of Pennsylvania 

Dept. of Biology, University of Pennsylvania, Philadelphia, PA 
19104 

Tel: 215-898-9384 
Fax: 215-898-8780 

Email: jecker@atgenome.bio.upenn.edu
Seq primer: T7 
Class: BAC ends 

High quality sequence start: 72 
High quality sequence stop: 353. 
FEATURES Location/Qualifiers
source 1..1187
/organism="Arabidopsis thaliana"
/strain="Columbia"
/db_xref="taxon:3702"
/clone="F19C22"
/clone_lib="IGF"
/sex="hermaphrodite"
/note="Vector: BeloBACII; Site_1: EcoRI; Site_2: EcoRI;
Produced by Thomas Altmann"
BASE COUNT 385 a 51 c 60 g 594 t 97 others 
ORIGIN 



Query Match 2.1%; Score 115.2; DB 120; Length 1187; 

Best Local Similarity 47.0%; Pred. No. 2.7e-08;

Conservative 0; Mismatches 402; Indels 6; Gaps 2; 



I llllll I llll II I lllll II I III llllll I 



II I II III II I II II III II llllll II III 



I lllll I llll II II I III III I I II III III 



Matches 


Qy 


3208 


Db 


1185 


Qy 


3268 


Db 


1125 


Qy 


3328 


Db 


1065 


Qy 


3388 


Db 


1005 


Qy 


3448 


Db 


945 


Qy 


3508 


Db 


885 


Qy 


3568 


Db 


825 



III llll II II I I 



llll llll 



III I llll llllll 



III I I I III II II I 



I II II II II llll II 



I I II 



III Mill I II lllll II llllll II I II III 
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BASE COUNT 
BACPAC Resources (http://bacpac.med.buffalo.edu/ordering) or from
Research Genetics (info@resgen.com). BAC end search page:
http://www.tigr.org/tdb/humgen/bac_end_search/bac_end_search.html.
Seq primer: SP6
Class: BAC ends.
FEATURES Location/Qualifiers
source 1..718
/organism="Homo sapiens"
/db_xref="GDB:7573443"
/db_xref="taxon:9606"
/clone="RPCHH92E4"
/clone_lib="RPCI-11"
/sex="Male"
/cell_type="Lymphocytes"
/note="Vector: pBACe3.6; Site_1: EcoRI; Site_2: EcoRI;
RPCI-11 Human Male BAC Library"
BASE COUNT 310 a 39 c 35 g 334 t
ORIGIN



Query Match 2.1%; Score 117.2; DB 102; Length 718;

Best Local Similarity 48.8%; Pred. No. 1.4e-08;

Matches 317; Conservative 0; Mismatches 333; Indels 0; Gaps 

Qy 4744 ttgtagtgaaattaattttaaatgttgtatctaatgttaacatcacttggcttgatttat 4803 

Nil I II II II II I III II I III III II I I II 
Db 17 TTGTGCTTTTCTTTGTTGGAAGGTTTTTGACTACTGATTGAATCTCTTTACTAGCTATAA 76 

Qy 4804 gttatgttatgtattttactttaatgatattgcatgtattgttaatttaacattgcttga 4863 

II I II III I II II III III I I II I I II II 
Db 77 GTAI1TTCAGACATTCTCTTTCCTTGTGATTCCATTTTGTTATATTGCATGTTTCTATGT 136 

Qy 4864 tcattatactcttctactattaattataaatggcactgttttgtttaaactttttacaag 4923 

llllll I III I II I III II I I II I 
Db 137 ATATTAIAIATATTATTTATATGTATATTATATATATTATTTACATATTATATATATTAT 196 

Qy 4924 ttaagacatgtataaatatatgacaatataattacaggttttagttcaatgttagctatc 4983 

I I I II I Mill! I II I I I I I III 

Db 197 ATTATATATATTATAATATATATAATATACTATATAATATACATATATGTTAATAATATA 256 

Qy 4984 ttagtatgttattgatgatcttaattacatttaaacaaattccacttaaaattttaataa 5043 

II I I II I I II I II I I II I I I II III 

Db 257 TAATATATGTTAATAATATATATAATATACATAGAAATATATAAATATATATAAATATAT 316 

Qy 5044 ataataacaaataattattgtaatataatacattaaatgcaacaaaaaatgaaataaata 5103 

llllll II II III llllll II I I II II II I II 
Db 317 ATAATATAAATATATATTTGAAATATATATTTCAAATATATATATAATATATAAATATTA 376 

Qy 5104 aaataaaatagcaaataattgttataatattgtaatataatatgtaccatattcttaact 5163 

i mi iii i ii ii i iii i mi mi n in 1 1 i • 

Db 377 ATATAATATAATATATTATATTATATATAATAATATATTATATATATTATAACATAATAT 436 

Qy 5164 gaaatagggtctaacctataatccctaaaatttcagtttaaatatttttatacctaccat 5223 

I III I III llllll II I III I llll I II 

Db 437 AATATATAATATAATATATAATATTATGTTATTATATATAATATATCTTATTACATATAT 496 
Qy 5224 attattagaactctttttaaatatattaaaattttaattataccaatttaattaaactat 5283 

i iii i i 1 1 1 mi ii iii i i iiii i iii iii iii 

Db 497 AATATATTATATTATATAATATATTATATAATATATAATATAATATATTATATAATATAT 556 

Qy 5284 taattatcttaactaaaatctaaaattttatttaacctattaataaattcctaattatct 5343 

II III II II II I I II I I II II III III I III I 

Db 557 TATATATAATATATATTATATTATATATATTATATATAATATATATATTATAATATATAT 616 

Qy 5344 tatctaatttaaaactctaattatcctaatttaatttaaattcttaatta 5393 

III III I llll II I II III llll 

Db 617 ATTATATATAATATATATAATATAATATATATTATATATAAAATATATAA 666 



RESULT 7 
B11102 

LOCUS B11102 1187 bp DNA GSS 14-MAY-1997
DEFINITION F19C22-T7 IGF Arabidopsis thaliana genomic clone F19C22,
genomic survey sequence.
ACCESSION B11102
VERSION B11102.1 GI:2092386
KEYWORDS GSS.
SOURCE thale cress.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; Rosidae; eurosids II; Brassicales;
Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 1187)
AUTHORS Feng,J., Dewar,K., Buehler,E., Kim,C., Li,Y., Shinn,P., Sun,H. and
Ecker,J.
TITLE BAC End Sequences at ATGC
JOURNAL Unpublished (1997)
COMMENT On Sep 10, 1998 this sequence version replaced gi:3556525.
Other_GSSs: F19C22-Sp6
Contact: Ecker J.
Arabidopsis Thaliana Genome Center
University of Pennsylvania
Dept. of Biology, University of Pennsylvania, Philadelphia, PA
19104
Tel: 215-898-9384
Fax: 215-898-8780
Email: jecker@atgenome.bio.upenn.edu
Seq primer: T7
Class: BAC ends
High quality sequence start: 72
High quality sequence stop: 353.
FEATURES Location/Qualifiers
source 1..1187
/organism="Arabidopsis thaliana"
/strain="Columbia"
/db_xref="taxon:3702"
/clone="F19C22"
/clone_lib="IGF"
/sex="hermaphrodite"
/note="Vector: BeloBACII; Site_1: EcoRI; Site_2: EcoRI;
Produced by Thomas Altmann"
BASE COUNT 385 a 51 c 60 g 594 t 97 others
ORIGIN

Query Match 2.1%; Score 117; DB 120; Length 1187;
Best Local Similarity 43.6%; Pred. No. 1.4e-08;
Matches 365; Conservative 0; Mismatches 470; Indels 3; Gaps 2;



3129 agaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaaaaaaaaact 3188 

II II II llllll I 

328 AAANNNNNNNNNNNNNNMNNNNMM1MNNNNNNNOT 387 

3189 aatgttggttggttgaattttata ttacggaatgtaatattatattttaaaataaaatta 3248 

I II I Ml I I II I IN II 

388 ATNNNNTNTTTTTNNNNNTNTNNNTNNNNTTTATTTTTTTTTTTTTATNNTTNTNNTTTN 447 

3249 tgttatttagattcttaatattttggagcattccatactataatttcgtaacataatatt 3308 

Ml I I III I III III I III llll I llllll 
448 NTNNTTNTAAAATTTTATTTTTTNNTTATATTTTANNTTATTATTTTNATATTAAATATT 507 

3309 aaaatatagtaatataaagtgtaattaactttaaattacaagcataatattaaattttga 3368 

II I II II I I II II II I III llll II 

508 AATGTTTATTAATTTTNTAATTNATATTATTATTTTTTAATATTATTTATAAAATATTTT 567 

3369 atcaattaatttttatttctattattttaattaatttagtctattttttcaaaataaaat 3428 

I II I llll I III II I I Ml II I I I 

568 AATGTTTTTAATAAATTTATTTTAAATTTAATNTAAAGATATAAATTATATATTTTTTTA 627 

3429 ttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgttatacttcaaa 3488 

I I III llllll llllll Mill II I I Ml II II 

628 ATTTTTATAATTAAAAAAAATTTTTATTNTAT-TTTAATTTTTTTTTTATTTATTTNANA 686 

3489 attataagtattatatttaccttgatgatttatttattagtatattaa ttctgattataa 3548 

II III II I Mil I I llllll II II I I I llll I 
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FEATURES 

found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
Location/Qualifiers
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BACR08K10"
/note="end : TET3"
BASE COUNT 201 a 64 c 131 g 202 t 503 others
ORIGIN



Query Match 2.3%; Score 126.2; DB 122; Length 1101; 

Best Local Similarity 21.5%; Pred. No. 6.2e-10; 

Matches 151; Conservative 310; Mismatches 238; Indels 4; Gaps 

Qy 3299 acataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataatat 3358 

I h!::::: I I :|| : II! I " |: : III I :| hll h 
Db 398 ATAWAWWWWWTTTTTTTTAWAMWAAMTAATTWAAWAWAAAAAATTWWAAMWAAAAW 457 

Qy 3359 taaattttgaatcaattaatttttatttctattattttaattaatttagtctattttttc 3418 

: l::ll hi II II III: I II hll hi : I : I :::|| 
Db 458 AWTAWVraTTAWTWAAAAAAAAAAMTTWTTTTTTTWTTTAWTTWATAWWTTWWWTTAAAW 517 

Qy 3419 aaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgtt 3478 

Mil llll :Hh hlllll :|::j::M: hlh I h :|| :::| II 
Db 518 AAAAAAAAAAAWAMWAAAWATAAATWTWWWWTTYTTMAAWATAAAMCMAAWYYHTYTT 577 

Qy 3479 at-acttcaaaattataagtattatatttaccttgatgatttatttattagtatattaat 3537 

I :::h: |:: : hhhll::::: : :::::|:::| : ::::|| 
Db 578 YTYHYYTYWTYTMTWHYHTMYTHAWAHTTWYHWYHTYAMWHWMTWMMWWHWTTYTA 637 

Qy 3538 tctgattataattatggtgggatacaatcgctttccactaaatattttaactatgattta 3597 

:: :| :: : |:| |:: :|:: :|:::| ::::::: I : : |: 
Db 638 AYYYYYTCMYYYHYMHWHHAHAHAMWWrTHTWWTHAYHWATYHYYYYMYCAMMCMCTHT 697 

Qy 3598 taaatttatttcaacatcgtatatttacttattaatacataatttatcataattttatgg 3657 

::: ::: |:: : :::|:::: :::: ::: || |:: ::::: :: ::| 
Db 698 CHHCYYYYHHYTAHHTHTHHWYAHYYMWYYMWAYYWMYCTACTYHYHHHHHYHWAYHTTW 757 

Qy 3658 aaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaaa 3717 

h : : : ::| || | :|| | :: ::||::::: :: : : :: : :: 
Db 758 YAWAHAMWMWHHAHYAAAAAWAAWATTHHYHHTTHYMHHTYMYHYYMYTCCYMCTYHCWH 817 

Qy 3718 atgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaa 3777 

: ||:| :|: ::::|:| :: :: : :: ::|: :::::: :| I :|| : 
Db 818 YYHTAYTCWTWTHHWMWTWTHWYHKTWWHHTTTHWAWWHTHTWCWWWWHATTWTWATHCW 877 

Qy 3778 ctaaataagataatataacatacggaacatcttacttgtaatcttacattcccataattt 3837 

Db 878 ACMTMHWHHMMHHI^CiMHHTHMCMCHHMCTCHHHHTMYHMTCHMraMHWHW 937 



3838 tattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattatta—tct 3894 

938 HHW^TWMTTMTTMMMMCCMMHHTC 997 

3895 aaataaagaaaaacacttaatttttataacattttttcatatatttgaaagattatattt 3954 

: h: : | : | | :| || |:::::::: :: :::::::: |: I I :::|: 

998 HYCTWHTYmYWWAWTAHAMTTATWWWW™^ 1057 

3955 tgtatatttacgtaaaaatatttgacatagattgagcaccttc 3997 

1058 YHTHCTWYYHHTYHMWWAWWMAWHWHHMYAHYHWAHHCWYYTM 1100 



RESULT 4 
CNS0167M/C 

LOCUS CNS0167M 1201 bp DNA GSS 26-JUL-1999
DEFINITION Drosophila melanogaster genome survey sequence T7 end of BAC
BACN15M24 of DrosBAC library from Drosophila melanogaster (fruit
fly), genomic survey sequence.
ACCESSION AL106396
VERSION AL106396.1 GI:5621701
KEYWORDS GSS.
SOURCE fruit fly.
ORGANISM Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE 1 (bases 1 to 1201)
AUTHORS Genoscope.
TITLE Direct Submission
JOURNAL Submitted (23-JUL-1999) Genoscope - Centre National de Sequencage :
BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
- Web : www.genoscope.cns.fr)
COMMENT Determination of this BAC-end sequence was carried out as part of a
collaboration with the European Drosophila Genome Project (EDGP) -
http://www.edgp.ebi.ac.uk -. This Drosophila melanogaster BAC
library (Dros BAC) was made by Alain Billaud at CEPH (Centre
d'Etude du Polymorphisme Humain) with funding provided by a MRC
project grant. The DNA was prepared from embryos by Alain Bucheton
and Genevieve Payan. It has been constructed in the vector
pBeloBAC11.
FEATURES Location/Qualifiers
source 1..1201
/organism="Drosophila melanogaster"
/plasmid="pBeloBAC11"
/db_xref="taxon:7227"
/clone_lib="DrosBAC"
/clone="BACN15M24"
/note="end : T7"
BASE COUNT 323 a 87 c 79 g 551 t 161 others
ORIGIN



Query Match 2.2%; Score 121.6; DB 123; Length 1201;

Best Local Similarity 37.4%; Pred. No. 3e-09;

Matches 280; Conservative 101; Mismatches 368; Indels 0; Gaps 0;

Qy 3219 aatgtaatattatattttaaaataaaattatgttatttagattcttaatattttggagca 3278 

:ll :h llll I III I II II: : II II h III II 
Db 1193 WATAWAWATATATAAATATAAAAATAAATAWAWAATATATAWNAAAAANTATAAAAAAWA 1134 

Qy 3279 ttccatactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaact 3338 

h : 1 1 1 1 1 : 1 1 ::: 1 1 1 1 1 1 1 1 : . I I Hll II II 
Db 1133 AAATAMTATAMMWAAAW1#JATATTAAATAWAATATAANWAAAAAAAAAAAAAAA 1074 

Qy 3339 ttaaattacaagcataatattaaattttgaatcaattaatttttatttctattattttaa 3398 

::: ||: I hll : I I III II lllll III = h :! 

Db 1073 AWWWTTTHTANAATATTTWTTTWNTATAWAATWTTTTTTTTTTTTTTTTATAWAWAAAWA 1014 

Qy 3399 ttaatttagtctattttttcaaaataaaatttaaatctaaataaaaataatttttcctta 3458 

II I I II I h I I : II I I II II I : h:l lh 
Db 1013 AAAAAAAATTTTAAAAATAAAWTAATTATWAAAATTTTTAAAAATTTTTWATWWTTTTTW 954 

Qy 3459 atgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgatt 3518 

I III :| I : :l II I :|llll :h :l III II I 
Db 953 AAAAAAAAAAWAATATWAAAWTTTTTTTTATWTATAAAWAWAWTTTTTTYWTTAAAAAAA 894 

Qy 3519 tatttattagtatattaattctgattataattatggtgggatacaatcgctttccactaa 3578 

I : I I ' III I llll : llll h 
Db 893 AAAWTTAAAAATWTAAAAATTATAAAATAAAAWAAAAAAAAAAAAAAAAAWWAAAAAWTT 834 

Qy 3579 atattttaactatgatttataaatttatttcaacatcgtatatttacttattaatacata 3638 

:lll III I h I III I I I h I I Ml Ihl hi 

Db 833 TWATTATAAWATTTTTWAAAAAAAAAAAATTTAATTYTTTTTNAAAATAAAAAAWAAAWA 774 

Qy 3639 atttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataac 3698 

::: I I ::l II Ihll | I! MM || : I ::|| I II 
Db 773 WWWAAAAAATffWTAMTATMWTTAAATNCATAAAACAAAAAAWAATTWHATAAAAAAAA 714 

Qy 3699 aaagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacaca 3758 

III I II I Mill I llll: h h ::: : II :\-- I- 
Db 713 AAAAAAAAAAAAAAAAAAAANATTTTTTTHACAAMMAMMAMMYMMMMCAAAAMAMMAAVM 654 

. Qy . 3759 aaaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgtaa 3818 
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gb_gssl3:* 
gb_gss14:*
gb_gss15:*
gb_gss16:*
gb_gss17:*
gb_gss18:*
gb_gss19:*
em_gss13:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.



SUMMARIES 



No.  Score  Match (%)  Length  DB   ID        Description
 1   145.6  2.6        1101    122  CNS00EVL  AL069706 Drosophil
 2   142    2.6        1101    122  CNS00EVL  AL069706 Drosophil
 3   126.2  2.3        1101    122  CNS0039G  AL063921 Drosophil
 4   121.6  2.2        1201    123  CNS0167M  AL106396 Drosophil
 5   119.8  2.2        1101    122  CNS00EO7  AL069440 Drosophil
 6   117.2  2.1        718     102  AQ416310  AQ416310 RPCI-11-1
 7   117    2.1        1187    120  B11102    B11102 F19C22-T7 I
 8   116.8  2.1        1101    122  CNS00RAE  AL077628 Drosophil
 9   115.6  2.1        1101    122  CNS00EO7  AL069440 Drosophil
10   115.2  2.1        1187    120  B11102    B11102 F19C22-T7 I
11   115    2.1        1101    122  CNS0021J  AL061936 Drosophil
12   115    2.1        1201    123  CNS0167M  AL106396 Drosophil
13   114.2  2.1        1101    122  CNS003BD  AL064091 Drosophil
14   113.2  2.0        1101    122  CNS003BD  AL064091 Drosophil
15   111.8  2.0        1101    122  CNS0021J  AL061936 Drosophil
16   110.2  2.0        1101    122  CNS0039G  AL063921 Drosophil
17   110    2.0        1225    123  CNS0161D  AL106171 Drosophil
18   109.8  2.0        1101    123  CNS0145U  AL103740 Drosophil
19   109.6  2.0        1101    122  CNS00EPO  AL069493 Drosophil
20   109.4  2.0        1101    122  CNS00BO1  AL057419 Drosophil
21   108.6  2.0        876     122  CNS009G1  AL053529 Drosophil
22   108.2  2.0        1101    122  CNS0O0B8  AL063632 Drosophil
23   108.2  2.0        1101    122  CNS003BB  AL064089 Drosophil
24   108    1.9        734     122  CNS010MP  AL099163 Drosophil
25   108    1.9        836     122  CNS011Q0  AL099642 Drosophil
26   107    1.9        935     120  B10881    B10881 F24H6-Sp6.1
27   106.2  1.9        1101    122  CNS001FB  AL060732 Drosophil
28   105.8  1.9        836     122  CNS011Q0  AL099642 Drosophil
29   105.2  1.9        1101    122  CNS004ZW  AL055440 Drosophil
30   105    1.9        928     122  CNS00DKY  AL071865 Drosophil
31   105    1.9        1101    122  CNS00RAE  AL077628 Drosophil
32   104.4  1.9        1101    122  CNS003DX  AL064587 Drosophil
33   103.8  1.9        1101    122  CNS0O0B8  AL063632 Drosophil
34   103.6  1.9        1101    123  CNS0145U  AL103740 Drosophil
35   103.4  1.9        1101    122  CNS003BB  AL064089 Drosophil
36   103.2  1.9        1101    122  CNS003DQ  AL064580 Drosophil
37   102.8  1.9        890     93   AQ026918  AQ026918 CIT-HSP-2
38   102.4  1.8        905     122  CNS00KHX  AL077798 Drosophil
39   102.4  1.8        916     64   AW155504  AW155504 mgie0028l
40   102.2  1.8        876     122  CNS009G1  AL053529 Drosophil
41   102.2  1.8        1101    122  CNS00EPO  AL069493 Drosophil
42   101.4  1.8        928     122  CNS00DKY  AL071865 Drosophil
43   101.4  1.8        996     122  CNS00FUH  AL071063 Drosophil
44   101    1.8        828     113  AQ739398  AQ739398 HS 5482J
45   101    1.8        828     113  AQ739398  AQ739398 HS.5482J

RESULT 1 
CNS00EVL

LOCUS CNS00EVL 1101 bp DNA GSS 04-JUN-1999
DEFINITION Drosophila melanogaster genome survey sequence T7 end of BAC:
BACR29B23 of RPCI-98 library from Drosophila melanogaster (fruit
fly), genomic survey sequence.
ACCESSION AL069706
VERSION AL069706.1 GI:4949849
KEYWORDS GSS.
SOURCE fruit fly.
ORGANISM Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE 1 (bases 1 to 1101)
AUTHORS Genoscope.
TITLE Direct Submission
JOURNAL Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage :
BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr
- Web : www.genoscope.cns.fr)
COMMENT Determination of this BAC-end sequence was carried out as part of a
collaboration with the Berkeley Drosophila Genome Project (BDGP).
The BDGP is constructing a physical map of the Drosophila
melanogaster genome using these BACs. For further information
please see http://www.fruitfly.org The BDGP Drosophila
melanogaster BAC library was prepared by Kazutoyo Osoegawa and
Aaron Mammoser in Pieter de Jong's laboratory in the Department of
Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
NY. The library is named RPCI-98 and was constructed by partial
EcoRI digestion of Drosophila DNA provided by the BDGP from the
isogenic strain y2; cn bw sp, the same strain used for the BDGP's
P1 and EST libraries. A more detailed description of the library
and how to order individual BAC clones, the entire library, or
filters for hybridization from the BACPAC Resource Center can be
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES Location/Qualifiers
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BACR29B23"
/note="end : T7"
BASE COUNT 419 a 91 c 60 g 299 t 232 others
ORIGIN



Query Match 2.6%; Score 145.6; DB 122; Length 1101;
Best Local Similarity 38.7%; Pred. No. 8.1e-13;
Matches 249; Conservative 130; Mismatches 260; Indels 5; Gaps 2;

Qy 3334
Db 457

Qy 3394
Db 517

Qy 3454
Db 577

Qy 3513
Db 637

Qy 3573
Db 697

Qy 3633
Db 757

Qy 3693
Db 813



I hl:h: 



:h! h I: : I Ihl 



I :IH:|I :| :| II: 



1111:1 llllllll: I 1:1 11:11111 



II I llll Ill I II III 



II lllll Mh I I 111:1 



I III I: hi II II :lll|:l I : I I |:| |:| 



I: III I :| :lh:: h 



I :ll I: I: : :| llll 
--ATAWAATAWAAWAWAWATAAATAW 812 



:| :|| : :|: II :| :| II I :l II hill: hi 
Db 3812 TTTGTAAATTTTAAAATAATCACATTTTGTTTATTTCTTTTTTATCGATAATATT---GG 3756
Qy 3400 taatttagtctattttttcaaaataaaatttaaatctaaataaaaataatttttccttaa 3459 

i ii Minimi i i mi i 1 1 n i i mi in 

Db 3755 TGGATTTGTCTATTTTTTTAGGAATTCATTTTATTATGTATTATCACTTTTTTGTTTTAT 3696 

Qy 3460 tgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgattt 3519 

III II II I I II I I II I III 
Db 3695 TCATAATTATTTTGAAAATAGTAAATACCGTGTAAATATACAAACCTAAAAATGTTATTA 3636 

Qy 3520 atttattagtatattaattctgattataattatggtgggatacaatcgctttccactaaa 3579 

I III III I II II I II I II I III I III llll 

Db 3635 ACTTTTAAGTTTTTTTTTTTTTTTTTTTTTTTTATATTAAGAATAATTGTAACCATTAAA 3576 

Qy 3580 tattttaactatgatttataaatttatttcaacatcgtatatttacttattaatacataa 3639 

III! I II III II llll III I I III III II I III 
Db 3575 TATTGGAGTATAAATAAATATATATATTATAAC--GAGACAATTAGTTAAAAAAAAATAG 3518 

Qy 3640 tttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataaca 3699 

II I I II I I III I I II I I II I I I I III 

Db 3517 TTAAAAAAAAATCGTTAAAAAAAAATATGAAAATAAATGGATATATAATTGAATGAATAA 3458 

Qy 3700 aagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacacaa 3759 

I I II III I III II II I lllll Mill II III II 
Db 3457 CATAAAA---AGATGACAATTTATCAAACTGTTAATTTAAAATAACTTAATCATACAAAA 3401

Qy 3760 aaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgtaat 3819 

III II III I III llll II III I II I 

Db 3400 AAAAAGGAACAAAAACAGGAAAAAGGAATAAAGTGTAAAGAAACACAAACAATTTAAAGA 3341 

Qy 3820 cttacattcccataattttattatgaaaaataatcttatattactcgaactaaatgttgt 3879 

II III llllllll III II Hill I II II I I 
Db 3340 CAAGCGAATTTATGAATTTATTATTTAAAGGTATATTATATATATAATCATAGATAATAT 3281 

Qy 3880 cacaaattattatctaaataaagaaaaacacttaatttttataacattttttcatatatt 3939 

III I I I I III I II I I Mil II I I 
Db 3280 TTAAAAGCAAAAAAAAGACAAACATATATCAGATTTGTATGTAATATAAGAATAATAAAG 3221 

Qy 3940 tgaaagattatattttgtatatttacgtaaaaatatt 3976 

II I II III II I III I I III 
Db 3220 TGGATATATACGTTTATTAAAATTAGTTCCAGTCATT 3184 



RESULT 15 
PCT-US92-00018-1/C 
Sequence 1, Application PC/TUS9200018
GENERAL INFORMATION: 
APPLICANT: Hoffman, Stephen L. 
APPLICANT: Charoenvit, Yupin 
APPLICANT: Hedstrom, Richard 
APPLICANT: Khusmith, Srisin 
APPLICANT: Rogers IV, William O.

TITLE OF INVENTION: Protective malaria sporozoite surface protein
TITLE OF INVENTION: immunogen and gene encoding 
NUMBER OF SEQUENCES: 2 



, David Spevack 
STREET: NMRDC Building 1 M2 National Naval 
STREET: Medical center 
CITY: Bethesda 
STATE: MD 
COUNTRY: USA 
ZIP: 20814-5044 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.24
CURRENT APPLICATION DATA: 
APPLICATION NUMBER: PCT/US92/00018 
FILING DATE: 19920103 
CLASSIFICATION: 424 
ATTORNEY/AGENT INFORMATION: 



NAME: Spevack, Avram D.
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (301) 295-6759 
TELEFAX: (301) 295-4033 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 4673 base pairs 

TYPE: NUCLEIC ACID 

STRANDEDNESS: double 

TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic)
HYPOTHETICAL: N 
ANTI-SENSE: N 
ORIGINAL SOURCE: 

ORGANISM: Plasmodium yoelii 

STRAIN: 17X(NL) 

DEVELOPMENTAL STAGE: erythrocytic stage 

TISSUE TYPE: Blood 

CELL TYPE: erythrocytic stage 

IMMEDIATE SOURCE: 

LIBRARY: Py-lambdagt11-2-7 kb genomic expression

CLONE: PylO.llll 
FEATURE: 
NAME/KEY: CDS 
LOCATION: 718..3195
OTHER INFORMATION: 
PCT-US92-00018-1 



Query Match 1.4%; Score 79; DB 6; Length 4673; 

Best Local Similarity 46.8%; Pred. No. 6.6e-06;

Matches 354; Conservative 0; Mismatches 395; Indels 8; Gaps 3; 

Qy 3220 atgtaatattatattttaaaataaaattatgttatttagattcttaatattttggagcat 3279 

I I I II III II llll I II I III I II llll I I 
Db 3932 AAGGATGATAATACTTCAAAAAATCATGAGCCAACTTTAGATATTCTCTTTTTTAACATT 3873 

Qy 3280 tccatactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaactt 3339 

I I I I III I I I III I III II I I 
Db 3872 CCATCATTTTTTTTTATCACACTTTTTAGTTCATAAAACTTAAGACCATTATTTTTATGT 3813 

Qy 3340 taaattacaagcataatattaaattttgaatcaattaatttttatttctattattttaat 3399 

III I MM I I III I I II MUM II llll 
Db 3812 TTTGTAAATTTTAAAATAATCACATTTTGTTTATTTCTTTTTTATCGATAATATT---GG 3756

Qy 3400 taatttagtctattttttcaaaataaaatttaaatctaaataaaaataatttttccttaa 3459 

I II MINIMI I I llll III II I I llll III 
Db 3755 TGGATTTGTCTATTTTTTTAGGAATTCATTTTATTATGTATTATCACTTTTTTGTTTTAT 3696 

Qy 3460 tgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgattt 3519 

III I I II I I II llll llll 
Db 3695 TCATAATTATTTTGAAAATAGTAAATACCGTGTAAATATACAAACCTAAAAATGTTATTA 3636 

Qy 3520 atttattagtatattaattctgattataattatggtgggatacaatcgctttccactaaa 3579 

Ml I III I II II I II I II I III I III llll 

Db 3635 ACTTTTAAGTTTTTTTTTTTTTTTTTTTTTTTTATATTAAGAATAATTGTAACCATTAAA 3576 

Qy 3580 tattttaactatgatttataaatttatttcaacatcgtatatttacttattaatacataa 3639 

llll I II III II llll III I I III III II I III 
Db 3575 TATTGGAGTATAAATAAATATATATATTATAAC--GAGACAATTAGTTAAAAAAAAATAG 3518 

Qy 3640 tttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataaca 3699 

III llll I III II II I I II I I I I III 
Db 3517 TTAAAAAAAAATCGTTAAAAAAAAATATGAAAATAAATGGATATATAATTGAATGAATAA 3458 

Qy 3700 aagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacacaa 3759 

I I II III I III II II I Mil lllll II III II 
Db 3457 CATAAAA---AGATGACAATTTATCAAACTGTTAATTTAAAATAACTTAATCATACAAAA 3401

Qy 3760 aaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgtaat 3819 

III II III I III llll II III I II I 
Db 3400 AAAAAGGAACAAAAACAGGAAAAAGGAATAAAGTGTAAAGAAACACAAACAATTTAAAGA 3341 
3027 gttggctggtc tacccaagagtgatcaaagtttgagctgccttcaatgagcca 3079

III I I llll I III II III I 
1012 ATTGAATAAACTGTTTTATACCTCTTTTATAAAAAAAAAAATAATAATTATAATAAATAG 953 

3080 atttttgcccataatggataaaggcaatttgtttagttcaactgctcacagaataatgtt 3139 

II llll III I III I III II III I II 
952 GGATTGCTTACAAATGATAAAATGAAATATACAGAATATATTATAGAATACTATAGTTTT 893 

3140 aaaatgaaattaaaataaggtggcctggtcacacacacaa aaaaaaacta 3189 

II I II II I II llll llll I II 

892 ATATAGGAACCAATAAGATATATATACTTTIAATAACAACTTTGTGATGTTAAAAGAATA 833 

3190 atgttggttggttgaattttatattacggaatgtaatattatattttaaaataaaattat 3249 

I II II III II II II I II I I III 
832 AAACTGTTTAAGACCTATGATTCAGAGAATATCCCCAATAATTATATATATATATATATT 773 

3250 gttatttagattcttaatattttggagcattccatactataatttcgtaacataatatta 3309 

772 TATATATATATCTATATTATTTTTTCCCOT 713 

3310 aaatatagtaatataaagtgtaattaactttaaattacaagcataatattaaattttgaa 3369 

I I INI I I I I II I II I llll II III 
712 TATGTTTAAAATATTTATAAATTTACATATACAAGTTCATTTTTCATATGTAAATTTTTT 653 

3370 tcaattaatttttatttctattattttaattaatttagtctattttttcaaaataaaatt 3429 

I II Mill III I II llll II llll II I I I 

652 TTTTTTCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTAAATTAGTAGAATTACTATTTTAT 593 

3430 taaatctaaataaaaataatttttccttaatgttgaaacaactcatgttatacttcaaaa 3489 

II I II llllllll I I II I I I II I II 

592 AAACTAAGAAAAAAAATAAATAAATGAATAAATTAATAAATAAATATATAAATAAAATTA 533 

3490 ttataagtattatatttaccttgatgatttatttattagtatattaattctgattataat 3549 

I II I Mill II II llll II llll I II II II I 
532 TAGGAACCACAATATTGGGGAGTATTATATATTGTGTATAATATATAGGATGGTTTTATT 473 

3550 tatggtgggatacaatcgctttccactaaatattttaactatgatttataaatttatttc 3609 

I I II II llllllll Mill 
472 ATAAGAAGTGTAAAACTATATTAATGTGIACACATCAAAATATTAATAATTGTATTCATA 413 

3610 aacatcgtatatttacttattaatacataatttatcataattttat-ggaaattgagac 3667 

II I I II II I II I I llll llllllll llll I II 
412 TTAAITGGAAATATATTAATAAGTTTTATAITTCAAGTAATTTTATAAACAAATGAACAC 353 

3668 caagaaacattaagagaacaaattctataacaaagacaatttagaaaaaaatgtactttt 3727 

llllllll II llll I Mil I I III llll 
352 ACAAACATATATATATATATATATATATATATATATATATATATATATAAAATAACTTAA 293 

3728 aggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaactaaataaga 3787 

I III I llll I I I II llll lllll III III 
292 ATGTATTGTTAATAAATATAAAGAAAAAAAAAAAAAAAAAAAGTTTTTTATCTATGTTAT 233 

3788 taatataacatacggaacatcttacttgtaatcttacattcccataattttattatgaaa 3847 

llllll I I llll III I II II III III 

232 TAATATGAATAATCATTATATATACTACATATCAAATATAAATATTTTTTATCTITTTAT 173 

3848 aataatcttatattact 3864 

I I II lllll I 
172 TTTTTTATTTTATTATT 156 



RESULT 12 
US-08-883-795A-36/C 

Sequence 36, Application US/08883795A

Patent No. 5985607
GENERAL INFORMATION: 

APPLICANT: Delcuve, Genevieve 
APPLICANT: Awang, Gregor 

TITLE OF INVENTION: Recombinant DNA Molecules and Expression
TITLE OF INVENTION: Vectors for Tissue Plasminogen Activator
NUMBER OF SEQUENCES: 39
CORRESPONDENCE ADDRESS:



ADDRESSEE: BERESKIN & PARR

STREET: 40 King Street West

CITY: Toronto

STATE: Ontario

COUNTRY: Canada

ZIP: M5H 3Y2 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/883,795A

FILING DATE: 27-JUN-1997 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Gravelle, Micheline 

REGISTRATION NUMBER: 40,261 

REFERENCE/DOCKET NUMBER: 7841-062 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (416) 364-7311 

TELEFAX: (416) 361-1398 
INFORMATION FOR SEQ ID NO: 36: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 665 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: CDNA 
ORIGINAL SOURCE: 

ORGANISM: Homo sapiens 
IMMEDIATE SOURCE: 

CLONE: Rh 32 
US-08-883-795A-36 



Query Match 1.5%; Score 82.4; DB 4; Length 665;

Best Local Similarity 51.0%; Pred. No. 1.2e-06;

Matches 252; Conservative 0; Mismatches 231; Indels 11; Gaps 

Qy 3168 tcacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaatgtaata 3227 

I I I I II II I III I lllll llll III I I 
Db 535 TTATAAATACTTTAATTATAAAATATGTAATTATAAATACTTTAATTATAAAATATGTAA 476 

Qy 3228 ttatattttaaaataaaattatgttatttagattcttaatattttggagcattccatact 3287 

lllll I III lllll III I I III I I II II I 
Db 475 TTATAAATACTTTATAAAATATGTAATTATAAAATATGTAATTATAAACATTTTAATTAT 416 

Qy 3288 ataatttcgtaacataatattaaaatatagtaatataaagtgtaattaactttaaattac 3347 

I III I I llll I I Ml I III II ninni I III 
Db 415 AAAATATGTAATTATAA----ACATTTTAATTATAAAATATGTAATTATAAACATTTTAA 360

Qy 3348 aagcataatattaaattttgaatcaattaatttttatttctattattttaattaatttag 3407 

I lllll llll I II llllll II I I I III III I Mil 
Db 359 TTATAAAATATGTAATTATAAACATTTTAATTATAAAATATGTAATTATAAACATTTTAA 300 

Qy 3408 tctattttttcaaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaa 3467 

I I III I I I I I llllll II I I III II II 
Db 299 TTATAAAATATTTAATTATAAACATTTTAATTATAAAATATTTAATTATAAATATTTTAA 240 

Qy 3468 caactcatgttatacttcaaaattataagtattatatttaccttgatgatttatttatta 3527 

I I I II II llll I llllll llll 
Db 239 TTATAAAATATTTAATTATAAATATTTTAATTATAAAATATTTAATTATAAATATTTTAA 180 

Qy 3528 gtatattaattctgattataattatggtgggatacaatcgctttccactaaatattttaa 3587 

' llll I I I lllllll III I II II llllllllllll 
Db 179 TTATAAAATATTTAATTATAAATATTTTAATTATAAAATATTTAATTATAAATATTTTAA 120 

Qy 3588 ctatgatttataaattta tttcaacatcgtatatttacttattaatacataat 3640 

III I III I III III I II lllllll llll llll I I 
Db 119 TTATAAAATATTTAATTATAAATATTTTAATTATAAAATATTTAATTATAAATATTTTAA 60 

Qy 3641 ttatcataatttta 3654 
APPLICANT: Chang, Andy C M
APPLICANT: Williams, Keith L 

TITLE OF INVENTION: Improved Plasmid Vectors for Cellular
TITLE OF INVENTION: Slime Moulds of the Genus Dictyostelium 

NUMBER OF SEQUENCES: 19 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Woodcock Washburn Kurtz Mackiewicz & Norris

STREET: One Liberty Place 46th Floor 

CITY: Philadelphia 

STATE: PA 

COUNTRY: USA

ZIP: 19103 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/867,106 

FILING DATE: 19920625 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: AU PJ 7187 

APPLICATION NUMBER: PCT/AU90/00530

FILING DATE: 02-NOV-1989 
ATTORNEY/AGENT INFORMATION: 

NAME: Feeney, Joanne Longo 

REGISTRATION NUMBER: 35,134 

REFERENCE/DOCKET NUMBER: RICE-0002 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 215-568-3100 

TELEFAX: 215-568-3439 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 5852 base pairs 

TYPE: NUCLEIC ACID 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
ANTI-SENSE: NO



NAME/KEY: CDS
LOCATION: 2378..5038



NAME/KEY: CDS
LOCATION: 2378..5038
US-07-867-106-2



Query Match 1.5%; Score 85; DB 1; Length 5852;

Best Local Similarity 46.1%; Pred. No. 6e-07;

Matches 403; Conservative 0; Mismatches 460; Indels 12; Gaps 3; 

Qy 3120 actgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaa 3179 

I I I III II Mil III III II I I II II 

Db 1381 AGTACGAACATAAATATGTATAAACCAAAAAAATTGATTAAGATAAAGTTATATGTTTGT 1440 

Qy 3180 aaaaaaactaatgttggttggttgaattttatattacggaatgtaatattatattttaaa 3239 

I II III I I MINIMI I I II II I I I I 

Db 1441 ATTTAATAAAATAGTTTAGTTTAAAATTTTATATCATTTTTTAAAAAATGAAAATGTTTG 1500 

Qy 3240 ataaaattatgttatttagattcttaatattttggagcattccatactataatttcgtaa 3299 

Db 1501 AAAAAAAAAATm 1560 

Qy 3300 cataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataatatt 3359 

I III II I I II III I III llll I II I I 

Db 1561 TAAAAGTTATTAACAAATATGTAAAAATTATAAAAAACTAACCTAGTTATAATTACTTTC 1620 

Qy 3360 aaattttgaatcaattaattttt---atttctattattttaattaatttagtctattttt 3416

Db 1621 CCCTCTTTTTTTTTTTTTTTTTTGTCATGACACm 1680 

Qy 3417 tcaaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatg 3476 



I II llll III I I MM I III III III 

Db 1681 TTTTAAAAAAAAAAAAAAAAATGTTAAAATACTATTTGATGACATTCATTTTTCCTAGTT 1740 

Qy 3477 ttatacttcaaaattataagtattatatttaccttgatgatttatttattagt-atatt 3534 

II I I II lllll llll I III I II III I I III 

Db 1741 TTTTTTTAGATAGATATAAAAATAAATTGCCTATCGATATATACTTAATTTATTAAGATT 1800 

Qy 3535 aattctgattataattatggtgggatacaatcgctttccactaaatattttaactatgat 3594 

I I III lllll I II I I III I I llll I I I 

Db 1801 GAATAATATTTTAATTTTTAATAAATTCTACTTTTTTTTTTTTTTTCTTTTTTTTTTAAA 1860 

Qy 3595 ttataaatttatttcaacatcgtatatttacttattaatacataatttatcataatttta 3654 

II llll II III I II II I I lllll I HUM II I 

Db 1861 TTTTAAAATTTTTTTTTTTTATTAGATCTCATAATTAAAAATCAATTTAAAATTAAAAGT 1920 

Qy 3655 tggaaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaa 3714 

I I I I I llll III I I II I I I I III I 
Db 1921 TATTTTTAAATATGCAAAAACTATAAAAAACTAATGTAGTTTAACCAACTTTTTTCTATT 1980 

Qy 3715 aaaatgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaat 3774 

I I llll I llll I II III I lllll III III 
Db 1981 TCTTTTTTTTTTTTTTTTTTTTTTTTACTTTGAAAAAAAAAAAAAAAAAAAAAAAAAAAA 2040 

Qy 3775 gaactaaataagataatataacatacggaacatcttacttgtaatctt acatt 3827 

III lllll I III I II II I II I II 

Db 2041 AAACCCTCATTATAAATATTAATTACTTTGGTTTTTTTTGATTTTTTTTTTAATAAATTT 2100 

Qy 3828 cccataattttattatgaaaaataatcttatattactcgaactaaatgttgtcacaaatt 3887 

■ II I II I II II I I III I I III I lllll 
Db 2101 AAAATTTTATTCTCTATCTAATTATACCTTATTTATAAATATTGGATAATATATCAAATA 2160 

Qy 3888 attatctaaataaagaaaaacacttaatttttataacattttttcatatatttgaaagat 3947 

lllll I I I Ml I II III lllll II III I 
Db 2161 TTTATCAGTTTTGGCATGACAATTTTAATTATATTTATTTTTTGATTAATTTTTTTTTTT 2220 

Qy 3948 tatattttgtatatttacgtaaaaatatttgacat 3982

I I llll II I II I I I III II 
Db 2221 TTTTTTTTTTAAAATTTCTTTTTTTTTTTTTTTAT 2255 



RESULT 10 
US-08-883-795A-36 
Sequence 36, Application US/08883795A 
Patent No. 5985607 
GENERAL INFORMATION: 
APPLICANT: Delcuve, Genevieve 
APPLICANT: Awang, Gregor 

TITLE OF INVENTION: Recombinant DNA Molecules and Expression 
TITLE OF INVENTION: Vectors for Tissue Plasminogen Activator 
NUMBER OF SEQUENCES: 39 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: BERESKIN & PARR

STREET: 40 King Street West 

CITY: Toronto 

STATE: Ontario 

COUNTRY: Canada 

ZIP: M5H 3Y2 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/883,795A

FILING DATE: 27-JUN-1997

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Gravelle, Micheline 

REGISTRATION NUMBER: 40,261 

REFERENCE/DOCKET NUMBER: 7841-062 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (416) 364-7311 

TELEFAX: (416) 361-1398 
COMPUTER: IBM PC compatible

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/487,826B

FILING DATE: 10-SEP-1993 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Israelsen, Ned 

REGISTRATION NUMBER: 29,655 

REFERENCE/DOCKET NUMBER: NIH121.001CP1
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 235-8550 

TELEFAX: (619) 235-0176 
INFORMATION FOR SEQ ID NO: 13:
SEQUENCE CHARACTERISTICS: 

LENGTH: 19124 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: Single 

TOPOLOGY: linear 
MOLECULE TYPE: CDNA 
HYPOTHETICAL: NO 
ANTI-SENSE: NO 
US-08-487-826B-13 



Query Match 1.8%; Score 100.4; DB 4; Length 19124;

Best Local Similarity 44.1%; Pred. No. 1.4e-09; 

Matches 877; Conservative 0; Mismatches 1091; Indels 22; Gaps 10; 

Qy 1970 attgaaacgtttaagaatttttactactgcaaattcagaataagtgaatttgttttttag 2029

III II II III I I II II II II I III I II I I 
Db 4660 ATTCATAATTTAGAGATTATGTAATATTGTTTATGTATCGTAATATATATTAATATAATT 4719 

Qy 2030 aaagattaaataagttagtattacgatttttagtttgatttggtggaaagtaatgtatgt 2089 

II I II II I I I I I II IMI II I I I III 

Db 4720 GTTTTTTTAGTATGTATGGTATTCTAATAATATATTCATATGTAGTCATAGTGTCAATGA 4779 

Qy 2090 ttttgaacataattatttgacaataattaagttttctagggaataaacggaaatatcttc 2149 

I I II III I II III I I II II III III II 

Db 4780 ATATAAAATATGGTATATTTATATTATTGTATATATTAAATAAGTAACACAGA-ACATTA 4838 

Qy 2150 ttcttttttgtaaaattactaatgcaagaacaaacaacgttttggggagcaaataatcta 2209 

I I I Mil I II I I III I I II I II 

Db 4839 TATATAGTAATAAATAGAAGAAATAATATATTTTTATGTTATATATTATTAGTTATTATA 4898 

Qy 2210 gctttaagtagtcagtgtaactctcaaaatctggtcataacttctaggctgagtttgctg 2269 

II III II I I Mill I III I III I II 
Db 4899 AAGGGGAAAATTCATAATATTTATGAAAATTTTTGTATATGATATAGTTATAAGTTAAAA 4958 

Qy 2270 tgctacagtagtaagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaa 2329 

I I I III I II III llll II III I II 
Db 4959 AAAAAAAAAAACAAGAACAAAAATGGAAAGCATAAAAAATGTTACTGTAATAGGATAAAA 5018 

Qy 2330 tctacaacttttcctttttcttcaattaacatatggttgattcaagttccgatctataat 2389 

I II I I I III II II II llll III III 
Db 5019 TATATTATATAAAATGTTTATTTTATCTTAAAAAGGTTCCTATTATAACATTAAAAAAAA 5078 

Qy 2390 aatttattacgatttatcaatttcaattaccttatatcatcctattataaatataagtca 2449 

II I III I II I I III II II III III II 

Db 5079 TTTGTCCCATTTTATAAATAATTAACTACATTTACATAATGAAATTTCGATTTTGTGTTT 5138 

Qy 2450 gttcaattcagttttcgaaagttcccaaaaattttgaattttattaaatttattccctaa 2509 

II II I II I I II Mill I I I II I 

Db 5139 TTTTGATGAATATTATGGACTAATTATTTATATGTGAATGCGTTCTATATAATAATAATA 5198 

Qy 2510 aaccgaaatagttatatctttcaaatttaagtttcatttttcaatccgattt-caatttc 2568 

I I II I II II II II I I I II Ml I I 
Db 5199 ATTTTATTTAAAAAAATGAAAAATAAGAAATAAATATCCTGATTTTGTAGTTCCAATAGC 5258 

Qy 2569 atccttttataactctctattatctataattacataaatttcaaattaattttgaaatat 2628 

I I Ml I M III llll I I II I I II I II II 
Db 5259 TTAATATAATTATGGACTCATATATATATTATATATATCTTTACAACAAGTAATAAGTAA 5318 



Qy 2629 ttacactttagtccctaagttcaaaactataaattttcactttagaaattaatcattttt 2688 

II III I III Ml II llll I I I llll I I 
Db 5319 ATATTATTTTAATCTTAATAAGGAAAATAAAAATAATAAAATAAGAA---TACTGAATAA 5375

Qy 2689 cacatctaagcatcaaatttaaccaaatgacacaaatttcatgattagttagatcaagct 2748 

I III II I III Mill III I II M Mill I 

Db 5376 TAAGTCATATTATACATTTTTTAAAAATGTAACATAATTACAAATACGTAACATGTATTA 5435 

Qy 2749 tttgagtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaat 2808 

I I I I I II III I I I I III I I I I 1 1 1 1 1 1 1 llll I 

Db 5436 TAGAAATAATAAGAATTTAATATTAAGGATAAATATAAATATTTAAAATTATATTTTTTT 5495 

Qy 2809 ttgaacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttt 2868 

II I III II II I I I I I I II I I III 

Db 5496 ATGTCAATTTATGTTATATTATATTATATTAACATGATTA-GTTTTTTGAAAAATATTTA 5554 

Qy 2869 tgttgcaaacggtggagagaagagggaaatgaagattgaccatatttttttattatgttt 2928 

I II I I I II I I I II I I I II I I 
Db 5555 AATATCATATAATAATAATAAATTAGTTAAAATAATAGTATTTCATACAAAATACTAACT 5614 

Qy 2929 taacatataatattaataatttaatcataattatactttggtgaatgtgacagtggggag 2988 

M I II II I III III llll II III I Ml llll 
Db 5615 TATAAGTATATCATATAATATTATATATATATATATTTATGTGTTTTTGATTGGGTGTAT 5674 

Qy 2989 atacgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagagt 3048 

III I I I I II I II I I I Mill 
Db 5675 ATAAGGCTATAAGTATATATGGGTTGTTCATTATATATTTATATGTGAATAGATACATAT 5734 

Qy 3049 gatcaaa gtttgagctgccttcaatgagccaatttttgcccataatggataaa 3101 

I II IMI I M II II I III I I III 
Db 5735 AAGTTAATATATTTATTTGTGTATATGTCTGTGTTAAGATAGATATGCATTACAGTTAAG 5794 

Qy 3102 ggcaatttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtg 3161 

Db 5795 GGTTATAGTTTTTTTTTTTTT ' G c'l' ' AAAAAAT ' G T ' ' C AACAAT 5854 

Qy 3162 gcctggtcacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaat 3221 

I I III I III III I I I I I Mill I I II 

Db 5855 TGCATATTACAAGAATAATATTTGTATAAAATATATATATATATATATATATAAAGACAT 5914 

Qy 3222 gtaatattatattttaaaataaaattatgttatttagattcttaatattttggagcattc 3281 

III I llll I I III I I Ml II I llll I 'II 

Db 5915 -TAAAACTATACTAATAGGTAATTAGTTTTATTATATCATCCTTTTATTATTATAATTTT 5973 

Qy 3282 catactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaacttta 3341 

II I I III I II Ml III MUM M Ml II . 

Db 5974 TTTTGTTTTACTTCTTGTCGTTCTTTTTTGTTATTATAATATAACAAATATAAAACAATA 6033 

Qy 3342 aattacaagcataatattaaattttgaatcaattaatttttatttctattattttaatta 3401 

I II II MM III II I III I II II 
Db 6034 TCAGTATTTGGAATATAAATAAATTTATTCTACATATATGCATATATATATATATATATA 6093 

Qy 3402 atttagtctattttttcaaaataaaatttaaatctaaataaaaataattttt---cctta 3458 

I Mill I II I II II I I I I I I Nil I I 
Db 6094 TATATATATATATATATATATATATATATATGTATGATTTTATACTATTTTTATACATGC 6153 

Qy 3459 atgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgatt 3518 

II II II I II lllllll III III I II Ml llll I 

Db 6154 ATTTTTATATATTTTAGTATATACTTTAAAGATATTATTAATATTTATATAGTAGCATAT 6213 

Qy 3519 tatttattagtatattaattctgattataattatggtgggatacaatcgctttccactaa 3578 

I III III I I II I llll III I llll 
Db 6214 ATGTATTTATATTATAACAAATATTTTCATTTATATAAATATATAGAACATGAACATTTT 6273 

Qy 3579 atattttaactatgatt-tataaatttatttcaacatcgtatatttacttattaatacat 3637 

II II II II III II Mill I llll llllllllll I II 
Db 6274 ATTAATAACTCATATTTGAATATATATATTTATAATGTGTATTTTTACTTATTTTTTTAT 6333 

Qy 3638 aatttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataa 3697 

I I II llll llllll I I I I II II llll I I II 
Db 6334 ATTATACAATAAAATTTTGAAATTCATAAAATGCATGAAATACATAAAAAAATACAACAA 6393 
Qy 4252 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4311

illinium mini mini iiiininmm iiiiiiiniii 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 
Qy 4312 tcaaaatacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagta 4371

miiiimiii i ii ii ii i milium mi n 

Db 193 TCAAAATACGAAA-------------AGCACAAAGAGTCTGAATACAAACAACCAAAATA 239

Qy 4372 tcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaacc 4431 

Mill Mill Mill millllll IIIIMI llllllllllllllllllll 
Db 240 TCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACC 299 

Qy 4432 ctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacga 4491 

iiniiiii iiiiiiiiiiiiiiiiiiiiii mum mm 1 1 1 r 1 1 1 1 1 1 

Db 300 CTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGA 359 
Qy 4492 gaaagaaaatctcga 4506 

nun i mi 

Db 360 TAAAGAAAAACCCGA 374 



RESULT 5 
US-08-787-335-18 

Sequence 18, Application US/08787335 

Patent No. 5981834 

GENERAL INFORMATION:



APPLICANT: John, Maliyakal E.
APPLICANT: Umbeck, Paul F.
APPLICANT: Brill, Winston J.
TITLE OF INVENTION: GENETICALLY
TITLE OF INVENTION: FOR ALTERED FIBER 
NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Quarles and Brady 

STREET: P.O BOX 2113 

STREET: FIRST WISCONSIN PLAZA 

CITY: MADISON 

STATE: WISCONSIN 

COUNTRY: U.S.A. 

ZIP: 53701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette - 3.50 inch, £ 
COMPUTER: Apple Macintosh

OPERATING SYSTEM: Macintosh 

SOFTWARE: Microsoft Word 4.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/787,335 

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/530,797 

FILING DATE: 

APPLICATION NUMBER: US 07/253,243 

FILING DATE: 04-OCT-88
ATTORNEY/AGENT INFORMATION: 

NAME: Nicholas J. Seay 

REGISTRATION NUMBER: 27,386 

REFERENCE/DOCKET NUMBER: 1122990245 
INFORMATION FOR SEQ ID NO: 18: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1283 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: CDNA to mRNA 
HYPOTHETICAL: no 
ANTI-SENSE: no 
ORIGINAL SOURCE: 

ORGANISM: Gossypium hirsutum 

STRAIN: Coker 312 

DEVELOPMENTAL STAGE: 15 day old fiber cells 
TISSUE TYPE: fiber cells 
IMMEDIATE SOURCE: 



LIBRARY: CKFB15 
CLONE: E9 
US-08-787-335-18 



Query Match 4.8%; Score 265.4; DB 4; Length 1283;
Best Local Similarity 84.3%; Pred. No. 9.4e-39;
Matches 316; Conservative 0; Mismatches 46; Indels 13; Gaps 1;



Qy 4132 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4191 

i ii i mm Minimi i mniiimm i n mm mimi 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72 

Qy 4192 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4251 

I [[ 1 1 1 ! 1 1 1 1 1 [ 1 1 1 1 1 1 1 1 1 1 I IMIIIMM I I Mil Mill II III 
Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4252 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4311 

minim Mini mini mmmmimimm minimi 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 

Qy 4312 tcaaaatacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagta 4371 

IIIIIIIMIIII I II II II I MIMIIMM Mill II 

Db 193 TCAAAATACGAAA-------------AGCACAAAGAGTCTGAATACAAACAACCAAAATA 239

Qy 4372 tcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaacc 4431 

iiiimi mil inn IMiiiiiii mini imiimmimimimm 

Db 240 TCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACC 299 

Qy 4432 ctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacga 4491 

III IMM IIIIIIIIIIIIIIIIIIIIII II I Ml II II MM II MM 1 114 
Db 300 CTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGA 359 

Qy 4492 gaaagaaaatctcga 4506 

linn I Ml 

Db 360 TAAAGAAAAACCCGA 374 



RESULT 6 
US-08-487-826B-13/C 
Sequence 13, Application 
Patent No. 5993827 
GENERAL INFORMATION: 
APPLICANT: Sim, Kim L. 
APPLICANT: Chitnis, Chetan 
APPLICANT: Miller, Louis H. 
APPLICANT: Peterson, David S.
APPLICANT: Su, Xin-zhaun
APPLICANT: Wellems, Thomas E. 

TITLE OF INVENTION: BINDING DOMAINS FROM PLASMODIUM VIVAX 

TITLE OF INVENTION: AND PLASMODIUM FALCIPARUM ERYTHROCYTE BINDING PROTEINS 

NUMBER OF SEQUENCES: 45 



ADDRESSEE: Knobbe Martens Olson & Bear
STREET: 620 Newport Center Drive 16th Floor 
CITY: Newport Beach 
STATE: California 
COUNTRY: US 
ZIP: 92660 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/487,826B

FILING DATE: 10-SEP-1993 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Israelsen, Ned 

REGISTRATION NUMBER: 29,655 

REFERENCE/DOCKET NUMBER: NIH121.001CP1
TELECOMMUNICATION INFORMATION: 
TISSUE TYPE: fiber cells
IMMEDIATE SOURCE:
LIBRARY: CKFB15
CLONE: E9
US-07-885-970A-17 



Query Match 4.8%; Score 265.4; DB 1; Length 1283; 

Best Local Similarity 84.3%; Pred. No. 9.4e-39;

Matches 316; Conservative 0; Mismatches 46; Indels 13; Gaps 

Qy 4132 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4191 

i ii i nun Milium i iiiiiiiiiniii i ii mi nun 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72 

Qy 4192 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4251

lllllllllllllllllllllll I llllllllll I I llllll Mill II III 
Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4252 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4311 

IMMMIMII 1 1 1 1 1 1 1 llllllll IMMMIIMIIMI MIMMMMI 
Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 

Qy 4312 tcaaaatacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagta 4371 

IMMIMIIIII I II II II I MIMMIMI Mill II 

Db 193 TCAAAATACGAAA-------------AGCACAAAGAGTCTGAATACAAACAACCAAAATA 239

Qy 4372 tcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaacc 4431 

Miiiii inn iiiiii Minimi mm mmmiimimmiimi 

Db 240 TCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACC 299 

Qy 4432 ctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacga 4491 

IMMIIII 1 1 1 1 1 1 1 1 ! 1 1 f 1 1 1 1 1 1 1 1 1 1 llllll llllllll llllllllll 
Db 300 CTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGA 359 

Qy 4492 gaaagaaaatctcga 4506 

llllllll I III 
Db 360 TAAAGAAAAACCCGA 374 



RESULT 2 
US-08-298-687A-17 
Sequence 17, Application US/08298687A 
Patent No. 5521078 
GENERAL INFORMATION: 
APPLICANT: John, Maliyakal E. 
TITLE OF INVENTION: GENETICALLY ENGINEERING COTTON 
TITLE OF INVENTION: PLANTS FOR ALTERED FIBER 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Nicholas J. Seay, Quarles & Brady
STREET: P.O. Box 2113, First Wisconsin Plaza 
CITY: Madison 
STATE: Wisconsin 
COUNTRY: USA 
ZIP: 53701 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: Microsoft Word 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/298,687A

FILING DATE:

CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/617,239 

FILING DATE: 21-NOV-1990 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/253,243 

FILING DATE: 04-OCT-1988
ATTORNEY/AGENT INFORMATION: 

NAME: Seay, Nicholas J. 



REGISTRATION NUMBER: 27,386 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (608) 283-2478 

TELEFAX: (608) 251-5139 
INFORMATION FOR SEQ ID NO: 17: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1283 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 
MOLECULE TYPE: CDNA 
HYPOTHETICAL: NO 
ANTI-SENSE: NO
ORIGINAL SOURCE: 

ORGANISM: Gossypium hirsutum 

STRAIN: Coker 312 

DEVELOPMENTAL STAGE: 15 day old fiber cells 
TISSUE TYPE: fiber cells 
IMMEDIATE SOURCE: 
LIBRARY: CKFB15 
CLONE: E9 
US-08-298-687A-17 



Query Match 4.8%; Score 265.4; DB 1; Length 1283;

Best Local Similarity 84.3%; Pred. No. 9.4e-39; 

Matches 316; Conservative 0; Mismatches 46; Indels 13; Gaps 1; 

Qy 4132 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4191 

i ii i mm Milium i miiiiiiimi i n mi mm 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72
Qy 4192 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4251 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i iiiiiiiiii 1 1 iiiiii mil ii m 

Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4252 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4311 

1 1 1 1 1 1 1 1 1 1 1 1 Mill llllllll MMIMMMIMM IMMMIMII 
Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 

Qy 4312 tcaaaatacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagta 4371 

lllllllllllll I II II II I MMMIIMI Mill Jl 

Db 193 TCAAAATACGAAA-------------AGCACAAAGAGTCTGAATACAAACAACCAAAATA 239

Qy 4372 tcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaacc 4431 

MIIIII Mill llllll llllllllll MUM IIIIIIIIIIIIIMIIMI 
Db 240 TCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACC 299 

Qy 4432 ctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacga 4491 

llllllll IMMMMMIIIMMIMI llllll llllllll llllllllll 
Db 300 CTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGA 359 



Qy 4492 gaaagaaaatctcga 4506

llllllll I III
Db 360 TAAAGAAAAACCCGA 374



RESULT 3 
US-08-530-797-18 

Sequence 18, Application US/08530797 

Patent No. 5597718 

GENERAL INFORMATION: 
APPLICANT: John, Maliyakal E. 
APPLICANT: Umbeck, Paul F.
APPLICANT: Brill, Winston J. 

TITLE OF INVENTION: GENETICALLY ENGINEERED COTTON PLANTS
TITLE OF INVENTION: FOR ALTERED FIBER
NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Quarles and Brady 

STREET: P.O. BOX 2113

STREET: FIRST WISCONSIN PLAZA 

CITY: MADISON 
MM I III I I III I M III I
Db 5555 MTATCATATAATAATAATAAATTAGTTAAAATAATAGTATTTCATACAAAATACTAACT 5614 

Qy 2929 taacatataatattaataatttaatcataattatactttggtgaatgtgacagtggggag 2988 

II I II III III III llll II III I III I I I I 

Db 5615 TATAAGTATATCATATAATATTATATATATATATATTTATGTGTTTTTGATTGGGTGTAT 5674 

Qy 2989 atacgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagagt 3048 

III I II I II MM I I I II I I 

Db 5675 ATAAGGCTATAAGTATATATGGGTTGTTCATTATATATTTATATGTGAATAGATACATAT 5734 

Qy 3049 gatcaaa-------gtttgagctgccttcaatgagccaatttttgcccataatggataaa 3101

I II llll I II II II I III II III 
Db 5735 AAGTTAATATATTTATTTGTGTATATGTCTGTGTTAAGATAGATATGCATTACAGTTAAG 5794 

Qy 3102 ggcaatttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtg 3161 

Db 5795 WTTMACTTTM™ 5854 

Qy 3162 gcctggtcacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaat 3221 

I I III I II I llll I I I I lllll I I II 

Db 5855 TGCATATTACAAGAATAATATTTGTATAAAATATATATATATATATATATATAAAGACAT 5914 

Qy 3222 gtaatattatattttaaaataaaattatgttatttagattcttaatattttggagcattc 3281 

III I llll lllll I II II II I llll I II 

Db 5915 -TAAAACTATACTAATAGGTAATTAGTTTTATTAIATCATCCTTTTATTATTATAATTTT 5973 

Qy 3282 catactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaacttta 3341 

II I I III II II II III llllllll II III II 

Db 5974 TTTTGTTTTACTTCTTGTCGTTCTTTTTTGTTATTATAATATAACAAATATAAAACAATA 6033 

Qy 3342 aattacaagcataatattaaattttgaatcaattaatttttatttctattattttaatta 3401 

I llllllll II I II I III III II 
Db 6034 TCAGTATTTGGAATATAAATAAATTTATTCTACATATATGCATATATATATATATATATA 6093 

Qy 3402 atttagtctattttttcaaaataaaatttaaatctaaataaaaataattttt---cctta 3458

I llllllll I II II I I I I I I llllll I I 
Db 6094 TATATATATATATATATATATATATATATATGTATGATTTTATACTATTTTTATACATGC 6153 

Qy 3459 atgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgatt 3518 

II II II I II lllllll III III I II III I II I I 
Db 6154 ATTTTTATATATTTTAGTATATACTTTAAAGATATTATTAATATTTATATAGTAGCATAT 6213 

Qy 3519 tatttattagtatattaattctgattataattatggtgggatacaatcgctttccactaa 3578 

I III III I I III llll III I llll 
Db 6214 ATGTATTTATATTATAACAAATATTTTCATTTATATAAATATATAGAACATGAACATTTT 6273 

Qy 3579 atattttaactatgatt-tataaatttatttcaacatcgtatatttacttattaatacat 3637 

II II II II III II lllll I llll limillll I II 

Db 6274 ATTAATAACTCATATTTGAATATATATATTTATAATGTGTATTTTTACTTATTTTTTTAT 6333 

Qy 3638 aatttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataa 3697 

I I II llll II II II I I I I II II llll I I II 
Db 6334 AITATACMTAAMTTTTGAMTTCATAAMTGCATGAAATACATAAAAAAATACAACAA 6393 

Qy 3698 caaagacaatttagaaaaaaat-gtacttttaggtaattttaagtactcttaaccaaaca 3756 

I I I III I I I II I I I llll II III II 
Db 6394 AACAAATGATAAAAACATTTTTATTAATATAATATAATATAATATAATAATATATTTTTC 6453 

Qy 3757 caaaaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgt 3816 

I III I I II III lllll II I I II 
Db 6454 CTGTTATTTATTTATCATTTTTTTTTTGATGCIATATATATTATTATATAATAAATTATA 6513 

Qy 3817 aatcttacattcccataattttattatgaaaaataatcttatattactcgaactaaatgt 3876 

I II I II III II II llll II I III I II II 
Db 6514 ATATATA---ACAACAAAAATTAATAATAATAATATACTACTTTTAATATAATACAACAA 6570

Qy 3877 tgtcacaaattattatctaaataaagaaaaacacttaatttttataacattttttcatat 3936 

I I III llllll II III I I I lllll I I III 
Db 6571 TACAAAGAATATGTATCTATATCAATTATATATATATGAATATATAAATATGATAGATAA 6630 

Qy 3937 atttgaaaga 3946 

I II III 



Db 6631 TATAGATAGA 6640 



Search completed: September 3, 2000, 03:05:49 
Job time: 28295 sec 
Qy 3026 agttggctggtctacccaagagtgatcaaagtttgagctgccttcaatgagccaattttt 3085

III I I! I II I I I I II II I I III I 
Db 6397 TGTTTTGTTGTATTTTTTTATGTATTTCATGCAHTTATGAATTTCAAMTTTTATTGTA 6338 

Qy 3086 gcccataatggataaaggcaatttgtttagttcaactgctcacagaataatgttaaaatg 3145 

INI I III II I I I II I II III I I 
Db 6337 TAATATAAAAAMTMGTAAAAATACACATTATAAATATATATATTCAAATATGAGTTAT 6278 

Qy 3146 aaattaaaataaggtggcctggtcacacacacaaaaaaaaactaatgttggttggttgaa 3205 

III III II II III III MM llll I I 
Db 6277 TAATAAAATGTTCATGTTCTATATATTTATATAAATGAAAATATTTGTTATAATATAAAT 6218 

Qy 3206 ttttatattacggaatgtaatattatattttaaaataaaattatgttatttagattctta 3265 

Mill II III I II II I III III II I I II II 
Db 6217 ACATATATGCTACTATATAAATATTAATAATATCTTTAAAGTAT-ATACTAAAATAIATA 6159 

Qy 3266 atattttggagcattccatactataatttcgtaacataatattaaaatatagtaatataa 3325 

III III Mill II II llll I Mill Mill 

Db 6158 AAAATGCATGTATAAAAATAGTATAAAATCATACATATATATATATATATATATATATAT 6099 

Qy 3326 agtgtaattaactttaaattacaagcataatattaaattttgaatcaattaatttttatt 3385 

I II I I I II I Mill llll I I III I I 
Db 6098 ATATATATATATATATATATATATGCATATATGTAGAATAAATTTATTTATATTCCAAAT 6039 

Qy 3386 tctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaataaaaa 3445 

II llll I I llll I llll I II I llll I II llll 
Db 6038 ACTGATATTGTTTTATATTTGTTATATTATAATAACAAAAAAGAACGACAAGAAGTAAAA 5979 

Qy 3446 taatttttccttaatgttgaaacaactcatgttatacttcaaaattataagtattatatt 3505 

II II III I II I III I II I I II llll I 
Db 5978 CAAAAAAAATTATAATAATAAAAGGATGATATAATAAAACTAATTACCTATTAGTATAGT 5919 

Qy 3506 taccttgatgatttatttattagtatattaattctgattataatta----tggtgggata 3561

I II I III III lllll II I II II II I II III 
Db 5918 TTTAAIGTCTTTATAIATATATATATATATATATTTTATACAAATATTATTCTTGTAATA 5859 

Qy 3562 caatcgctttccactaaatattttaactatgatttataaatttatttcaacatcgtatat 3621 
I I II HUM III I II I III III 

Db 5858 TGCATATTGTTAGTTATCTATTTTTTATATATATGTACAAAAAAAAAAAAAAAAAACTAT 5799 

Qy 3622 ttacttattaatacataatttatcataattttatggaaattgagaccaagaaacattaag 3681 

II II I II I I II II Ml II III II 
Db 5798 AACCCTTAACTGTAATGCATATCTATCTTAACACAGACATATACACAAATAAATATATTA 5739 

Qy 3682 agaacaaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagt 3741 

I I I I II I I II I II I II II II II II 
Db 5738 ACTTATATGTATCTATTCACATATAAATATATAATGAACAACCCATATATACTTATAGCC 5679 

Qy 3742 actcttaaccaaacacaaaaattcaaatcaaatgaactaaataagataatataacatacg 3801 

II I I mm II I I III llll II llll I llll 
Db 5678 TTATATACACCCAATCAAAAACACATAAATATATATATATATAATATTATATGATATAC- 5618 

Qy 3802 gaacatcttacttgtaatcttacattcccataattttattatgaaaaataatcttatatt 3861 

1 1 iii i i i i i i ii mm i ii i nun 

Db 5619 TTATAAGTTAGTATTTTGTATGAAATACTATTATTTTAACTAATTTATTATTATTATATG 5560 

Qy 3862 actcgaactaaatgttgtcacaaattattatctaaataaagaaaaacacttaatttttat 3921 

I I I Ml I II I II II I llll I Ml II I I I 

Db 5559 ATATTTAAATATTTTTCAAAAAACTAATCATGTTAATATAATATAATA-TAACATAAATT 5501 

Qy 3922 aacattttttcatatatttgaaagattatattttgtatatttacgtaaaaatatt 3976 

Nil Mill II II I III II llll III I II 
Db 5500 GACATAAAAAAATATAATTTTAAATATTTATATTTATCCTTAATATTAAATTCTT 5446 



RESULT 14 
V22740 

ID V22740 standard; DNA; 3701 BP.
AC V22740; 

DT 28-SEP-1998 (first entry) 

DE Babesia microti BMNHO antigen sequence. 

KW antigen; detection; diagnosis; vaccine; tick-borne disease;



KW differentiation; Lyme disease; ehrlichiosis; ss. 

OS Babesia microti. 

FH Key Location/Qualifiers 

FT CDS 1210..2599

FT /*tag= a

FT /product= antigen

PN EP-834567-A2. 

PD 08-APR-1998. 

PF 01-OCT-1997; 117067.

PR 24-APR-1997; US-845258. 

PR 01-OCT-1996; US-723142.

PA (CORI-) CORIXA CORP. 

PI Houghton R, Lodes MJ, Reed SG, Sleath PR; 

DR WPI; 98-195465/18.

DR P-PSDB; W56290.

PT Polypeptides comprising Babesia microti antigens and their

PT immunogenic fragments or epitopes - and related nucleic acid,

PT vectors, transformed cells and antibodies, useful for diagnosis of 

PT infection and in protective vaccines 

PS Claim 8; Page 32-35; 113pp; English. 

CC The sequence is that encoding a polypeptide comprising at least 

CC one antigenic portion of a Babesia microti antigen. It can be used

CC to diagnose B. microti infection by detecting specific antibodies

CC in usual immunoassays. Infection can also be diagnosed using:

CC (a) primers or probes derived from the coding sequence, in

CC standard amplification or hybridisation tests, or (b) using

CC antibodies to detect the corresponding antigen. It is also

CC useful in vaccines to protect against infection, especially 

CC when formulated with an adjuvant. The new diagnostic methods 

CC allow rapid differentiation between B. microti infection and 

CC other tick-borne diseases (Lyme disease and ehrlichiosis) that 

CC have similar symptoms but require different treatments. 

SQ Sequence 3701 BP; 1458 A; 457 C; 492 G; 1294 T; 



Query Match 1.8%; Score 101; DB 1; Length 3701; 

Best Local Similarity 47.4%; Pred. No. 6.3e-06;

Matches 398; Conservative 0; Mismatches 435; Indels 6; Gaps 3; 

Qy 3208 ttatattacggaatgtaatattatattttaaaataaaattatgttatttagattcttaat 3267 

iiiiin i ii mm i m i n m m 11 i i - 

Db 369 TTATATTCATGTGGTTATAATTATAAAAGTATATATAGTTTTGTAATTGTAATGATATAA 428 

Qy 3268 attttggagcattccatactataatt----tcgtaacataatattaaaatatagtaatat 3323

Ml II II llllll I II III! Ill lllll I Ml 
Db 429 AATTAGAACAGATATAATTAATAATTCAAATATTATATTAATTTTATTATATATGATTAT 488 

Qy 3324 aaagtgtaatta-actttaaattacaagcataatattaaattttgaatcaattaattttt 3382 

I I! Ill I I I I! II III I I MM II II 

Db 489 TATTGATATTTATATAATTACATATTGTTATTGTATCATTTAATGATTATATATCAATAT 548 

Qy 3383 atttctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaataa 3442 

I II! ! llll II I I II II II I II I I I I I 

Db 549 CCATATATATATATAATAATTGAATTATAATTAAATTAATTGGCATATTACATTTATAAT 608 

Qy 3443 aaataatttttccttaatgttgaaacaactcatgttatacttcaaaattataagtattat 3502 

II III II I llll I II I II llll MM III I I II I 

Db 609 AATATATTATTAGTCAATATGACATCATATTATATTATCCATCATGATTGTGAATGTAAC 668 

Qy 3503 atttaccttgatgatt-tatttattagtatattaattctgattataattatggtgggata 3561 

II lllll III llll I I IIIMIII limilllll II 
Db 669 TAGAACATTGATTATTATATTAAATCACATATTAATACTGATTATAATAATATCATTGAT 728 

Qy 3562 caatcgctttccactaaatattttaactatgatttataaatttatttcaacatcgtatat 3621 

I I Mill I II III II II I II Mill 
Db 729 AATCTAATAATATAGTATTATCTCTAATAATATTGTATTATCTCTAATATTATGGTATAA 788 

Qy 3622 ttacttattaatacataatttatcataattttatggaaattgagaccaagaaacattaag 3681 

llll MM II I M HIM M I m 

Db 789 TAGATACTGTGAAAATAAATTCAACTGGAGATAAGGAAACCATTTTGTATAGATATTTTA 848 



Qy 3682 agaacaaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagt 3741

III II II III I I Ml I III II I II 
RESULT 10
T70055

ID T70055 standard; cDNA; 1283 BP.
AC T70055;

DT 20-AUG-1997 (first entry)

DE Cotton fibre specific cDNA clone E9.

KW cotton; E6; fibre; promoter; transgenic plant; truncated;

KW heterologous gene expression; ds.

OS Gossypium hirsutum strain Coker 312.

PN US5620882-A.

PD 15-APR-1997.

PF 04-OCT-1988; 253243.

PR 04-OCT-1988; US-253243.

PR 21-NOV-1990; US-617239.

PR 18-MAY-1992; US-885970.

PR 19-OCT-1994; US-298829.
PA (CETU ) AGRACETUS INC.
PI John M;

DR WPI; 97-235185/21.

PT DNA constructs contg. truncated promoter sequence - for
PT fibre-specific gene expression in cotton plants
PS Example 3; Column 45-48; 48pp; English.
CC T70040-57 are cotton fibre-specific cDNA clones which can be used to
CC obtain genomic clones containing fibre-specific promoters. Claimed DNA
CC constructs comprise a truncated promoter sequence (from one of T70031-38)
CC that promotes preferential gene expression in plant fibre cells, a
CC protein coding sequence not naturally associated with the promoter
CC sequence and a 3' termination sequence. The DNA constructs are useful for
CC expressing foreign genes in fibre-producing plants, esp. to produce
CC transgenic cotton plants with varied cotton fibre characteristics and
CC quality. The present sequence comprises E9 cDNA isolated from clone
CC CKFB15-E9 (CK - Coker; FB15 - 15 day old bolls).

SQ Sequence 1283 BP; 509 A; 233 C; 251 G; 290 T;



Query Match 4.8%; Score 265.4; DB 1; Length 1283;

Best Local Similarity 84.3%; Pred. No. 2.9e-27;

Matches 316; Conservative 0; Mismatches 46; Indels 13; Gaps 1;

Qy 4132 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4191 

i ii i nun iiiiiiiii i iiiiiiiiiiiiii i ii mi linn 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72
Qy 4192 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4251 

mnniiiiiiiiimiiii i iiimini 1 1 mm inn n in 

Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4252 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4311 

! 1 1 1 1 ! 1 1 ! 1 1 1 Mill 1 1 f M 1 11 IIIIIIIIIIIIII!! Illlllllllll 
Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 

Qy 4312 tcaaaatacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagta 4371 

lllllllllllll I II II II I lllllllllll lllll II 

Db 193 TCAAAATACGAAA-------------AGCACAAAGAGTCTGAATACAAACAACCAAAATA 239

Qy 4372 tcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaacc 4431 

him inn mill iiiiiiiii mini mimiimmnm 

Db 240 TCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACC 299 

Qy 4432 ctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacga 4491 

iiiiiiiii mimiiiiiiimmii iiiiii iiiiini ninniii 

Db 300 CTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGA 359 

Qy 4492 gaaagaaaatctcga 4506 

nilllll I III 

Db 360 TAAAGAAAAACCCGA 374 



RESULT 11 
T43361 

ID T43361 standard; cDNA; 974 BP.
AC T43361;

DT 11-MAR-1997 (first entry)

DE Cotton FbLate 2-82A gene cDNA clone A8 (FbLate-1). 

KW FbLate; promoter; fibre; transgenic plant; cotton; ds. 

OS Gossypium hirsutum. 

PN WO9639021-A1. 

PD 12-DEC-1996. 

PF 06-JUN-1996; U09449.

PR 06-JUN-1995; US-467504.

PA (MONS ) MONSANTO CO.

PI John ME;

DR WPI; 97-042726/04.

PT Plant fibre-specific, developmentally regulated FbLate promoter -

PT useful for producing transgenic plants, esp. cotton, with altered

PT fibre properties

PS Claim 8; Page 55-56; 79pp; English.

CC cDNA clones A8 or FbLate-1 (T43361) and All or FbLate-2 (T43362) 

CC correspond to RNAs prevalent in late development of cotton 

CC fibers. They were isolated from a 23-day cotton fibre cDNA 

CC library by screening with 24-day fibre cDNA. A8 and All are

CC partial clones of the FbLate 2-82A gene. They can be used to 

CC identify FbLate promoters (see also T43360) useful for fibre- 

CC specific expression of foreign proteins in transgenic plants, esp. 

CC cotton fiber.

SQ Sequence 974 BP; 388 A; 161 C; 222 G; 203 T; 



Query Match 3.7%; Score 207.8; DB 1; Length 974; 

Best Local Similarity 91.0%; Pred. No. 9.1e-20; 

Matches 232; Conservative 0; Mismatches 22; Indels 1; Gaps 1;

Qy 4555 agccttgaatcatatgacactggtgcatgtgccatcatcatgcagtaatttcatggtata 4614

iiiiii i iininimninniiiiiiiiniiinnmiiiimi in' 

Db 684 AGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTCATGGGATA 743
Qy 4615 tcgtaa-tatatagttaataaaaaagatggtgattgggaaatgtgtgtgtgcattcctcc 4673 

i mi iiiii iiiiiiiimiiinim i mi minimi mm m. 

Db 744 TTGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGCATTCATCC 803

Qy 4674 atgcactaatggtgaatctctttgcatacatagaaattctaaatggttatagtttatgtt 4733

III I llll ! I f 1 1 1 ! 1 1 1 1 1 1 1 1 IIIIII lllll lllllllimilllllll 
Db 804 ATGTAGCAATGCTGAATCTCTTTGCATGCATAGAGATTCTGAATGGTTATAGTTTATGTT 863

Qy 4734 atagtgtatgttgtagtgaaattaattttaaatgttgtatctaatgttaacatcacttgg 4793

III II llll IIIIIIIIIIIIII llllllllll lllllllllllllllllll 

Db 864 ATATCGTTTGTTCTAGTGAAATTAATTTTGAATGTTGTATGTAATGTTAACATCACTTGG 923 



Qy 4794 cttgatttatgttat 4808

lllllllllllll I
Db 924 CTTGATTTATGTTTT 938



RESULT 12 
T43362 

ID T43362 standard; cDNA; 645 BP. 

AC T43362; 

DT 11-MAR-1997 (first entry)

DE Cotton FbLate 2-82A gene cDNA clone All (FbLate-2). 

KW FbLate; promoter; fibre; transgenic plant; cotton; ds. 

OS Gossypium hirsutum. 

PN WO9639021-A1.

PD 12-DEC-1996.

PF 06-JUN-1996; U09449. 

PR 06-JUN-1995; US-467504. 

PA (MONS ) MONSANTO CO. 

PI John ME; 

DR WPI; 97-042726/04.

PT Plant fibre-specific, developmentally regulated FbLate promoter -

PT useful for producing transgenic plants, esp. cotton, with altered 

PT fibre properties 

PS Claim 8; Page 56-57; 79pp; English. 

CC cDNA clones A8 or FbLate-1 (T43361) and All or FbLate-2 (T43362) 

CC correspond to RNAs prevalent in late development of cotton 

CC fibers. They were isolated from a 23-day cotton fibre cDNA 
Qy 4207 ttactcataagtgtctcactagtgaccggtagccacactgtttcggcagcggctcgacgt 4266

iiiiiiii i iiiiiiiiii iii minium n m iiiiiiiiini i 

Db 61 TTACTCATTACTGTCTCACTAATGATCGGTAGCCACACCGTCTCGTCAGCGGCTCGACAT 120 
Qy 4267 ttattcgagacacaagcaacctcatcagagctcccacaattggcttcaaaatacgaaagc 4326 

iiiiii i linn iiiiiiiimiim iiniiiiiiniiiiiiimiii 

Db 121 TTATTCCACACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATACGAAA-- 178
Qy 4327 acgagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacgaagagtactc 4386 

i mil ii i iiiiiiimimiii inn minimi i 

Db 179 AGCACGAAGAGTCTGAATACAAACAGCCAAAATATCATGAAGAGTACCC 227 

Qy 4387 aaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgcaaacagcatga 4446

inn 1 1 1 1 1 1 1 r 1 1 1 r m 1 1 i iiiiimmmmimimim inn 

Db 228 AAAACATGAGAAGCCTGAAATGTACAAGGAGGAAAAACAAAAACCCTGCAAACATCATGA 287 

Qy 4447 agagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaagaaaatctcga 4506 

lllllllllllllllll IIIIII IIIIIIII IIIIIIIIII IIIIIIII I III 
Db 288 AGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCCGA 347 



RESULT 6 
T43366 

ID T43366 standard; DNA; 519 BP.

AC T43366; 

DT 11-MAR-1997 (first entry)

DE Cotton FbLate 2-82A gene cDNA clone All amplified fragment. 

KW FbLate; promoter; fibre; transgenic plant; cotton; 

KW Gossypium hirsutum; ds. 

OS Synthetic. 

PN WO9639021-A1. 

PD 12-DEC-1996. 

PF 06-JUN-1996; U09449. 

PR 06-JUN-1995; US-467504. 

PA (MONS ) MONSANTO CO. 

PI John ME; 

DR WPI; 97-042726/04. 

PT Plant fibre-specific, developmentally regulated FbLate promoter -

PT useful for producing transgenic plants, esp. cotton, with altered 

PT fibre properties 

PS Example 5; Page 63; 79pp; English. 

CC A DNA clone (T43366) was generated by 5' RACE using primers (see

CC also T43364-65) based on FbLate2 clone All (T43362), a partial 

CC cDNA clone corresponding to mRNA prevalent in the late development 

CC of cotton fibre. The RACE product showed 91.6% similarity at the 

CC nucleotide level to the genomic clone, FbLate2-82A (see also 

CC T43360). The homology of the RACE product started from nucleotide 

CC position 2269 of the FbLate2-82A gene. The ATG initiation codon

CC was identified at position 2315 of the gene. 

SQ Sequence 519 BP; 191 A; 127 C; 87 G; 114 T; 



Query Match 4.8%; Score 267; DB 1; Length 519; 

Best Local Similarity 85.6%; Pred. No. 1.9e-27;

Matches 314; Conservative 0; Mismatches 40; Indels 13; Gaps 1; 

Qy 4140 ttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttcttcctttt 4199 

llllimilllllllll 1 1 1 1 M 1 1 1 1 1 M 1 1 1 I II llll 1 1 1 1 1 1 II 1 1 !! 1 1 
Db 80 TTCTTTTCTTTCTATTTGGTTAACCATGGCTCATAACTTTTGTCATCCTTTCTTCCTTTT 139

Qy 4200 ccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcggcagcggc 4259 

lllimilllllll I IIIIIIIIII I I IIIIII lllll II III MUM 
Db 140 CCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGC 199 

Qy 4260 tcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggcttcaaaata 4319 

iiiii mini iiiiiiii iimiiiiiiiim iimmiiiiiniiiii 

Db 200 TCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATA 259 

Qy 4320 cgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacgaag 4379 

lllll I II II II I M 1 1 ! 1 1 1 1 ! I lllll NIMH 

Db 260 CGAAA-------------AGCACAAAGAGTCTGAATACAAACAACCAAAATATCACGAAA 306



Qy 4380 agtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgcaaac 4439 

I III IIIIII IIIIIIIIII lllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 11 II I !! 1 1 
Db 307 ACTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACCCTGCAAAC 366 

Qy 4440 agcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaagaaa 4499 

I 1 1 1 1 1 1 [ 1 1 1 1 1 1 1 M ! M 1 1 1 IIIIII IIIIIIII IIIIIIIIII lllllll 
Db 367 ATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAA 426 

Qy 4500 atctcga 4506 

I I III 
Db 427 AACCCGA 433 



RESULT 7 
T13048 

ID T13048 standard; cDNA; 1283 BP.
AC T13048;

DT 27-MAY-1996 (first entry)
DE Cotton fibre-specific cDNA clone E9.

KW Cotton; fibre; promoter; transgenic plant; crop improvement; ds.
OS Gossypium hirsutum strain Coker 312.
PN US5495070-A.

PD 27-FEB-1996.

PF 04-OCT-1988; 253243.
PR 04-OCT-1988; US-253243.
PR 21-NOV-1990; US-617239.
PR 18-MAY-1992; US-885970.
PA (CETU ) AGRACETUS INC.
PI John M;

DR WPI; 96-139095/14.

PT New isolated fibre-specific promoters - used for introducing
PT altered fibre-specific characteristics into plants, partic. cotton.
PS Example 3; Column 45-46; 48pp; English.
CC Cotton cDNA clone E9 (T13048) was isolated from a cDNA library of
CC cotton var. Coker 312 15-day-old boll cells using a subtractive
CC hybridization procedure. The clone hybridises strongly to fiber
CC RNA and weakly to petal RNA. E9 and other fibre-specific cDNA clones
CC (see T13033-47 and T13049-T13050) were used to screen cotton genomic
CC libraries, leading to the isolation of genomic clones (see T13025-32
CC and T13052-53) contg. sequences capable of promoting gene expression
CC in fibre cells.

SQ Sequence 1283 BP; 509 A; 233 C; 251 G; 290 T;



Query Match 4.8%; Score 265.4; DB 1; Length 1283; 

Best Local Similarity 84.3%; Pred. No. 2.9e-27;

Matches 316; Conservative 0; Mismatches 46; Indels 13; Gaps 1; 

Qy 4132 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4191 

i ii i mm 1 1 1 m m 1 1 1 i iiiiimiiim i n mi iiiiii 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72 

Qy 4192 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4251 

1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 M I M M I I IIIIIIIIII I I IIIIII Mill II III 
Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4252 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4311 

minium mum iiiiiiii iiiiiiiiiiiiiiii iiiiimiiim 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 

Qy 4312 tcaaaatacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagta 4371 

lllllllllllll I II II II i minimi lllll II 

Db 193 TCAAAATACGAAA-------------AGCACAAAGAGTCTGAATACAAACAACCAAAATA 239

Qy 4372 tcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaacc 4431

iiiiiii mil mm iiiiiiiiii mini nniniiiinnmii 

Db 240 TCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACC 299 
Qy 4432 ctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacga 4491 

iiiimn iiininnimiimiii iiiiii iiiiiiii iiiinnn 

Db 300 CTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGA 359 
Db 61 TATTAAATAATTATTAATTAAAATTTATGGACTTTTGGACTGTCTGACTAATTTTCAGAA 120

Qy 1866 ttttattttggttttgggttttgttgaattttttagataattattttaaatattctgcat 1925 

Db 121 TTTTATTTTGGTTTTGGGTTTTGTTGAGTTTTTTAGATAATTATTTTAAATATTCTGCAT 180 

Qy 1926 aatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaaga 1985 

Db 181 AATTTTTCTGTTATTTGAAAAGGATGTTCGAATTTTTTTTCAAAATTGAAACGTTTAAGA 240 

Qy 1986 atttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataagtt 2045 

Db 241 ATTTTTACTACTGCAAATTCAGAATAAGTGAATTTGTTTTTTAGAAAGATTAAATAAGTT 300 

Qy 2046 agtattacgatttttagtttgatttggtggaaagtaatgtatgtttttgaacataattat 2105 

Db 301 AGTATTACGATTTTTAGTTTGATTTGGTGGAAAGTAATGTATGTTTTTGAACATAATTAT 360 

Qy 2106 ttgacaataattaagttttctagggaataaacggaaatatcttc-ttcttttttgtaaaa 2164

iiiiiiiiiiiiiiiiiiiiiiii iiiiiiiiiiiiiiiiiM ii illinium 

Db 361 TTGACAATAATTAAGTTTTCTAGGAAATAAACGGAAATATCTTCTTTTTTTTTTGTAAAA 420 

Qy 2165 ttactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcag 2224 

Db 421 TTACTAATGCAAGAACAAACAACGTTTTGGGAAGCAAATAATCTAGCTTTAAGTAGTCAG 480 

Qy 2225 tgtaactctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtagtaag 2284 

Db 481 TGTAACTCTCAAAATCTGGTCATAACTTCTAGGCTGAGTTTGCTGTGCTACAGTAGTAAG 540 

Qy 2285 tctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttttcct 2344 

Db 541 TCTATAGAAACTTACCTGACAAAACGACATGACGTCAGGGTCGAATCTACAACTTTTCCT 600

Qy 2345 ttttcttcaattaacatatggttgattcaagttccgatctataataatttattacgattt 2404 

Db 601 TTTTCTTCAATTAACATATGGTTGATTCAAGTTCCGATCTATAATAATTTATTACGATTT 660 

Qy 2405 atcaatttcaattaccttatatcatcctattataaatataagtcagttcaattcagtttt 2464 

Db 661 ATCAATTTCAATTACCTTATATCATCCTATTATAAATATAAGTCAGTTCAATTCAGTTTT 720

Qy 2465 cgaaagttcccaaaaattttgaattttattaaatttattccctaaaaccgaaatagttat 2524 

milium iiiiiiiiiiiiiiiiiimmiiiiimiiiiiimiiii n 

Db 721 CGAAAGTTCCCTAAAATTTTGAATTTTATTAAATTTATTCCCTAAAACCGAAATAGTGAT 780 

Qy 2525 atctttcaaatttaagtttcatttttcaatccgatttcaatttcatccttttataactct 2584 

Db 781 ATCTTTCAAATTTAAGTTTCATTTTTCAATCCGATTTCAATTTCATCCTTTTATAACTCT 840

Qy 2585 ctattatctataattacataaatttcaaattaattttgaaatatttacactttagtccct 2644 

Db 841 CTATGATCTATAATTACATAAATTTCAAACTAATTTTGAAATATATACACTTTAGTCCCT 900 

Qy 2645 aagttcaaaactataaattttcactttagaaattaatcatttttcacatctaagcatcaa 2704 

Db 901 AAGTTCAAAACTATAAATTTTCACTTTAGAAATTAATCATTTTTCACATCTAAGCATCAA 960

Qy 2705 atttaaccaaatgacacaaatttcatgattagttagatcaagcttttgagtcttcaaaac 2764 

Db 961 ATTTAACCAAATGACACAAATTTCATGATTAGTTAGATCAAGCTTTTGAGTCTTCAAAAA 1020 

Qy 2765 ataaaaatt----acaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaa 2820 

Db 1021 CATAAAAATTACAAAAAAAAAAAAACAAACTTAAAATCATTTATCAATTTGAACAACAAA 1080 

Qy 2821 gcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacgg 2880 

Db 1081 GCTTGGCCGAATGCTAAGAGCTTAAAAATGGCTTCTTTTGTTTCTTTTTGTTGCAAACGG 1140

Qy 2881 tggagagaagagggaaatgaagattgaccatatttttttattatgttttaacatataata 2940 

Db 1141 TGGAGAGAAGAGGGAAATGAAGATTGACCATATTTTTTTATTATGTTTTAACATATAATA 1200



Qy 2941 ttaataatttaatcataattatactttggtgaatgtgacagtggggagatacgtaaagta 3000 

Db 1201 TTAATAATTTAATCATAATTATACTTTGGTGAATGTGACAGTGGGGAGATACGTAAAGTA 1260

Qy 3001 ttttaacattatactttttgcaagcagttggctggtctacccaagagtgatcaaagtttg 3060 

Db 1261 -TATAACATTATACTTTTTGCAAGCAGTTGGCTGGTCTATCCAAGAGTGATCAAAGTTTG 1319 

Qy 3061 agctgccttcaatgagccaatttttgcccataatggataaaggcaatttgtttagttcaa 3120 

Db 1320 AGCTGCCTTCAATGAGCCAATTTTTGCCCATAATGGATAAAGGCAATTTGTTTAGTTCAA 1379

Qy 3121 ctgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacac--- 3177 

Db 1380 CTGCTCACAGAATAATGTTAAAATGAAATTAAAATAAGGTGGCCTGGTCACACACACACA 1439 

Qy 3178 aaaaaaaaactaatgttggttggttgaattttatattacggaatgtaatattatatttta 3237 

Db 1440 AAAAAAAAACTAATGTTGGTTGGTTGAATTTTATATTACGGAATGTAATGTTATATTTTA 1499

Qy 3238 aaataaaattatgttatttagattcttaatattttggagcattccatactataatttcgt 3297 

Db 1500 AAATAAAATTATGTTATTTAGATTCTTAATATTTT-GAGCATTCCATACTATAATCTCGT 1558

Qy 3298 a-acataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataat 3356 

Db 1559 ATACATAATATTAAAATATAGTAATATAAAGTGTAATTAACTTTAAATTACAAGCATAAT 1618 

Qy 3357 attaaattttgaatcaattaatttttatttctattattttaattaatttagtctattttt 3416 

Db 1619 ATTAAATTTTGAATCAATTAATTTTTATTTCTATTATTTTAATTAATTTAGTCTATTTTT 1678 

Qy 3417 tcaaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatg 3476 

1 M 1 1 1 f 1 1 1 f 1 1 M 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 

Db 1679 TCAAAATAAAATTTAAATCTAAATAAAAATAATTTTTCCTTAATATT 1725 

Qy 3477 ttatacttcaaaattataagtattatatttaccttgatgatttatttattagtatattaa 3536 

mi i mi i ii i him linn ii ii ir- 

Db 1726 ATTAATAAATTTATTTCAACATCATATATTTACTTATTAATACATAAA 1773 

Qy 3537 ttctgattataattatggtgggatacaatcgctttccactaaatattttaactatgattt 3596 
II I 

Db 1774 TTAT 1777 

Qy 3597 ataaatttatttcaacatcgtatatttacttattaatacataatttatcataattttatg 3656 

Db 1778 AATAATTTATCATAATTTTATG 1799 

Qy 3657 gaaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaa 3716 

Db 1800 GAAATTGAGACCAAGAAACATTAAGAGAACAAATTCTATAACAAAGACAATTTAG-TAAA 1858

Qy 3717 aatgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatga 3776 

Db 1859 AATGTACTTTTAGGTAATTTTAAGTACTCTTAACCAAACACAAAAATTCAAATCAAATGA 1918 

Qy 3777 actaaataagataatataacatacggaacatcttacttgtaatcttacattcccataatt 3836 

Db 1919 ACCAAATAAGATAATATAACATACAGAATATCCTACTTGTATTCTTACATTCCCGTAATC 1978

Qy 3837 ttattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaa 3896 

Db 1979 ATATTATGAAAAGTAATATTATATTACCTGAGCCAAATGCTCTCACAAACTATTATCCAA 2038

Qy 3897 ataaagaaaaac-acttaatttttataacattttttcatatatttgaaagattatattt 3954 

Db 2039 AAAAAAMTGTTGAATATAATTTTTATAACATTTTTTCATATATTTGCAAGATTATATTT 2098 

Qy 3955 tgtatatttacgtaaaaatatttgacatagattgagcaccttcttaacataatcccacca 4014 

Db 2099 TGTATATTTACGTAAAAATATTTGACATAGATTGAACACCTTCTTAACATAATCCCACCA 2158
Db 1973 TAAGAATTTTTACTACTGCAAATTCAGAATAAGTGAATTTGTTTTTTAGAAAGATTAAAT 2032

Qy 2041 aagttagtattacgatttttagtttgatttggtggaaagtaatgtatgtttttgaacata 2100
Db 2033 AAGTTAGTATTACGATTTTTAGTTTGATTTGGTGGAAAGTAATGTATGTTTTTGAACATA 2092

Qy 2101 attatttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgt 2160

Qy 2161 aaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtag 2220
Db 2153 AAAATTACTAATGCAAGAACAAACAACGTTTTGGGGAGCAAATAATCTAGCTTTAAGTAG 2212

Qy 2221 tcagtgtaactctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtag 2280
Db 2213 TCAGTGTAACTCTCAAAATCTGGTCATAACTTCTAGGCTGAGTTTGCTGTGCTACAGTAG 2272

Qy 2281 taagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttt 2340
Db 2273 TAAGTCTATAGAAACTTACCTGACAAAACGACATGACGTCAGGGTCGAATCTACAACTTT 2332

Qy 2341 tcctttttcttcaattaacatatggttgattcaagttccgatctataataatttattacg 2400
Db 2333 TCCTTTTTCTTCAATTAACATATGGTTGATTCAAGTTCCGATCTATAATAATTTATTACG 2392

Qy 2401 atttatcaatttcaattaccttatatcatcctattataaatataagtcagttcaattcag 2460
Db 2393 ATTTATCAATTTCAATTACCTTATATCATCCTATTATAAATATAAGTCAGTTCAATTCAG 2452

Qy 2461 ttttcgaaagttcccaaaaattttgaattttattaaatttattccctaaaaccgaaatag 2520
Db 2453 TTTTCGAAAGTTCCCAAAAATTTTGAATTTTATTAAATTTATTCCCTAAAACCGAAATAG 2512

Qy 2521 ttatatctttcaaatttaagtttcatttttcaatccgatttcaatttcatccttttataa 2580
Db 2513 TTATATCTTTCAAATTTAAGTTTCATTTTTCAATCCGATTTCAATTTCATCCTTTTATAA 2572

Qy 2581 ctctctattatctataattacataaatttcaaattaattttgaaatatttacactttagt 2640
Db 2573 CTCTCTATTATCTATAATTACATAAATTTCAAATTAATTTTGAAATATTTACACTTTAGT 2632

Qy 2641 ccctaagttcaaaactataaattttcactttagaaattaatcatttttcacatctaagca 2700
Db 2633 CCCTAAGTTCAAAACTATAAATTTTCACTTTAGAAATTAATCATTTTTCACATCTAAGCA 2692

Qy 2701 tcaaatttaaccaaatgacacaaatttcatgattagttagatcaagcttttgagtcttca 2760
Db 2693 TCAAATTTAACCAAATGACACAAATTTCATGATTAGTTAGATCAAGCTTTTGAGTCTTCA 2752

Qy 2761 aaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaa 2820
Db 2753 AAACATAAAAATTACAAAAAAAAAACAAACTTAAAATCATTTATCAATTTGAACAACAAA 2812

Qy 2821 gcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacgg 2880
Db 2813 GCTTGGCCGAATGCTAAGAGCTTAAAAATGGCTTCTTTTGTTTCTTTTTGTTGCAAACGG 2872

Qy 2881 tggagagaagagggaaatgaagattgaccatatttttttattatgttttaacatataata 2940
Db 2873 TGGAGAGAAGAGGGAAATGAAGATTGACCATATTTTTTTATTATGTTTTAACATATAATA 2932

Qy 2941 ttaataatttaatcataattatactttggtgaatgtgacagtggggagatacgtaaagta 3000
Db 2933 TTAATAATTTAATCATAATTATACTTTGGTGAATGTGACAGTGGGGAGATACGTAAAGTA 2992

Qy 3001 ttttaacattatactttttgcaagcagttggctggtctacccaagagtgatcaaagtttg 3060
Db 2993 TTTTAACATTATACTTTTTGCAAGCAGTTGGCTGGTCTACCCAAGAGTGATCAAAGTTTG 3052

Qy 3061 agctgccttcaatgagccaatttttgcccataatggataaaggcaatttgtttagttcaa 3120
Db 3053 AGCTGCCTTCAATGAGCCAATTTTTGCCCATAATGGATAAAGGCAATTTGTTTAGTTCAA 3112

Qy 3121 ctgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaaa 3180
Db 3113 CTGCTCACAGAATAATGTTAAAATGAAATTAAAATAAGGTGGCCTGGTCACACACACAAA 3172

Qy 3181 aaaaaactaatgttggttggttgaattttatattacggaatgtaatattatattttaaaa 3240
Db 3173 AAAAAACTAATGTTGGTTGGTTGAATTTTATATTACGGAATGTAATATTATATTTTAAAA 3232

Qy 3241 taaaattatgttatttagattcttaatattttggagcattccatactataatttcgtaac 3300
Db 3233 TAAAATTATGTTATTTAGATTCTTAATATTTTGGAGCATTCCATACTATAATTTCGTAAC 3292

Qy 3301 ataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataatatta 3360
Db 3293 ATAATATTAAAATATAGTAATATAAAGTGTAATTAACTTTAAATTACAAGCATAATATTA 3352

Qy 3361 aattttgaatcaattaatttttatttctattattttaattaatttagtctattttttcaa 3420
Db 3353 AATTTTGAATCAATTAATTTTTATTTCTATTATTTTAATTAATTTAGTCTATTTTTTCAA 3412

Qy 3421 aataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgttat 3480
Db 3413 AATAAAATTTAAATCTAAATAAAAATAATTTTTCCTTAATGTTGAAACAACTCATGTTAT 3472

Qy 3481 acttcaaaattataagtattatatttaccttgatgatttatttattagtatattaattct 3540
Db 3473 ACTTCAAAATTATAAGTATTATATTTACCTTGATGATTTATTTATTAGTATATTAATTCT 3532

Qy 3541 gattataattatggtgggatacaatcgctttccactaaatattttaactatgatttataa 3600
Db 3533 GATTATAATTATGGTGGGATACAATCGCTTTCCACTAAATATTTTAACTATGATTTATAA 3592

Qy 3601 atttatttcaacatcgtatatttacttattaatacataatttatcataattttatggaaa 3660
Db 3593 ATTTATTTCAACATCGTATATTTACTTATTAATACATAATTTATCATAATTTTATGGAAA 3652

Qy 3661 ttgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaaaatg 3720
Db 3653 TTGAGACCAAGAAACATTAAGAGAACAAATTCTATAACAAAGACAATTTAGAAAAAAATG 3712

Qy 3721 tacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaacta 3780
Db 3713 TACTTTTAGGTAATTTTAAGTACTCTTAACCAAACACAAAAATTCAAATCAAATGAACTA 3772


Qy 


3781 


aataagataatataacatacggaacatcttacttgtaatcttacattcccataattttat 




Db 


3773 


AATAAGATAATATAACATACGGAACATCTTACTTGTAATCTTACATTCCCATAATTTTAT 




Qy 


3841 


tatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaaataa 


3900 


Db 


3833 


TATGAAAAATAATCTTATATTACTCGAACTAAATGTTGTCACAAATTATTATCTAAATAA 




Qy 


3901 


agaaaaacacttaatttttataacattttttcatatatttgaaagattatattttgtata 


3960 


Db 


3893 


AGAAAAACACTTAATTTTTATAACATTTTTTCATATATTTGAAAGATTATATTTTGTATA 


3952 


Qy 


3961 


tttacgtaaaaatatttgacatagattgagcaccttcttaacataatcccaccataagtc 




Db 


3953 


TTTACGTAAAAATATTTGACATAGATTGAGCACCTTCTTAACATAATCCCACCATAAGTC 




Qy 


4021 


aagtatgtagatgagaaattggtacaaacaacgtggggccaaatcccaccaaaccatctc 




Db 


4013 


AAGTATGIAGATGAGAAATTGGTACAAACAACGTGGGGCCAAATCCCACCAAACCATCTC 


4072 


Qy 


4081 


tcattctctcctataaaaggcttgctacacatagacaacaatccacacacaaatacacgt 


4140 


Db 


4073 


TCATTCICTCCTATAAAAGGCTTGCTACACATAGACAACAATCCACACACAAATACACGT 


4132 


Qy 


4141 


tcttttctttctatttgattaaccatggctcatagcattcgtcaccctttcttccttttc 


4200 


Db 


4133 


TCTTTTCTTTCTATTTGATTAACCATGGCTCATAGCATTCGTCACCCTTTCTTCCTTTTC 


4192 
841 TATGAAAAATAATCTTATATTACTCGAACTAAATGTTGTCACAAATTATTATCTAAATAA 3900 

901 agaaaaacacttaatttttataacattttttcatatatttgaaagattatattttgtata 3960 

901 AGAAAAACACTTAATTTTTATAACATTTTTTCATATATTTGAAAGATTATATTTTGTATA 3960 

961 tttacgtaaaaatatttgacatagattgagcaccttcttaacataatcccaccataagtc 4020 

961 TTTACGTAAAAATATTTGACATAGATTGAGCACCTTCTTAACATAATCCCACCATAAGTC 4020 

021 aagtatgtagatgagaaattggtacaaacaacgtggggccaaatcccaccaaaccatctc 4080 

021 AAGTATGTAGATGAGAAATTGGTACAAACAACGTGGGGCCAAATCCCACCAAACCATCTC 4080 

081 tcattctctcctataaaaggcttgctacacatagacaacaatccacacacaaatacacgt 4140 

081 TCATTCTCTCCTATAAAAGGCTTGCTACACATAGACAACAATCCACACACAAATACACGT 4140 

141 tcttttctttctatttgattaaccatggctcatagcattcgtcaccctttcttccttttc 4200 

141 TCTTTTCTTTCTATTTGATTAACCATGGCTCATAGCATTCGTCACCCTTTCTTCCTTTTC 4200 

201 caacttttactcataagtgtctcactagtgaccggtagccacactgtttcggcagcggct 4260 

201 CAACTTTTACTCATAAGTGTCTCACTAGTGACCGGTAGCCACACTGTTTCGGCAGCGGCT 4260 

261 cgacgtttattcgagacacaagcaacctcatcagagctcccacaattggcttcaaaatac 4320 

261 CGACGTTTATTCGAGACACAAGCAACCTCATCAGAGCTCCCACAATTGGCTTCAAAATAC 4320 

321 gaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacgaaga 4380 

321 GAAAGCACGAGAGTCTGAATACGAAAAGCCAGAATACAAACAGCCAAAGTATCACGAAGA 4380 

381 gtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgcaaaca 4440 

381 GTACTCAAAACTTGAGAAGCCTGAAATGCAAAAGGAGGAAAAACAAAAACCCTGCAAACA 4440 

441 gcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaagaaaa 4500 

441 GCATGAAGAGTACCACGAGTCACACGAATCAAAGGAGCAAAAAGAGTACGAGAAAGAAAA 4500 

501 tctcgacgaattcccccgggcgtcgacggctagcgaagatcttcgggcccgtcgagcctt 4560 

501 TCTCGACGAATTCCCCCGGGCGTCGACGGCTAGCGAAGATCTTCGGGCCCGTCGAGCCTT 4560 

561 gaatcatatgacactggtgcatgtgccatcatcatgcagtaatttcatggtatatcgtaa 4620 

561 GAATCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTCATGGTATATCGTAA 4620 

621 tatatagttaataaaaaagatggtgattgggaaatgtgtgtgtgcattcctccatgcact 4680 

621 TATATAGTTAATAAAAAAGATGGTGATTGGGAAATGTGTGTGTGCATTCCTCCATGCACT 4680 

681 aatggtgaatctctttgcatacatagaaattctaaatggttatagtttatgttatagtgt 4740 

681 AATGGTGAATCTCTTTGCATACATAGAAATTCTAAATGGTTATAGTTTATGTTATAGTGT 4740 

741 atgttgtagtgaaattaattttaaatgttgtatctaatgttaacatcacttggcttgatt 4800 

741 ATGTTGTAGTGAAATTAATTTTAAATGTTGTATCTAATGTTAACATCACTTGGCTTGATT 4800 
801 tatgttatgttatgtattttactttaatgatattgcatgtattgttaatttaacattgct 4860 
801 TATGTTATGTTATGTATTTTACTTTAATGATATTGCATGTATTGTTAATTTAACATTGCT 4860 
861 tgatcattatactcttctactattaattataaatggcactgttttgtttaaactttttac 4920 
861 TGATCATTATACTCTTCTACTATTAATTATAAATGGCACTGTTTTGTTTAAACTTTTTAC 4920 
921 aagttaagacatgtataaatatatgacaatataattacaggttttagttcaatgttagct 4980 
921 AAGTTAAGACATGTATAAATATATGACAATATAATTACAGGTTTTAGTTCAATGTTAGCT 4980 



Qy 4981 atcttagtatgttattgatgatcttaattacatttaaacaaattccacttaaaattttaa 5040
Db 4981 ATCTTAGTATGTTATTGATGATCTTAATTACATTTAAACAAATTCCACTTAAAATTTTAA 5040

Qy 5041 taaataataacaaataattattgtaatataatacattaaatgcaacaaaaaatgaaataa 5100
Db 5041 TAAATAATAACAAATAATTATTGTAATATAATACATTAAATGCAACAAAAAATGAAATAA 5100

Qy 5101 ataaaataaaatagcaaataattgttataatattgtaatataatatgtaccatattctta 5160
Db 5101 ATAAAATAAAATAGCAAATAATTGTTATAATATTGTAATATAATATGTACCATATTCTTA 5160

Qy 5161 actgaaatagggtctaacctataatccctaaaatttcagtttaaatatttttatacctac 5220
Db 5161 ACTGAAATAGGGTCTAACCTATAATCCCTAAAATTTCAGTTTAAATATTTTTATACCTAC 5220

Qy 5221 catattattagaactctttttaaatatattaaaattttaattataccaatttaattaaac 5280
Db 5221 CATATTATTAGAACTCTTTTTAAATATATTAAAATTTTAATTATACCAATTTAATTAAAC 5280

Qy 5281 tattaattatcttaactaaaatctaaaattttatttaacctattaataaattcctaatta 5340
Db 5281 TATTAATTATCTTAACTAAAATCTAAAATTTTATTTAACCTATTAATAAATTCCTAATTA 5340

Qy 5341 tcttatctaatttaaaactctaattatcctaatttaatttaaattcttaattatcttaat 5400
Db 5341 TCTTATCTAATTTAAAACTCTAATTATCCTAATTTAATTTAAATTCTTAATTATCTTAAT 5400

Qy 5401 ttgtaacctcctccacccagctagatgctggacccgaatccgggagattacatcggccat 5460
Db 5401 TTGTAACCTCCTCCACCCAGCTAGATGCTGGACCCGAATCCGGGAGATTACATCGGCCAT 5460

Qy 5461 tgagatggcgtgatcagggtttggcgcgccggtacccaattcgccctatagtgagttcgt 5520
Db 5461 TGAGATGGCGTGATCAGGGTTTGGCGCGCCGGTACCCAATTCGCCCTATAGTGAGTTCGT 5520

Qy 5521 attacgcgcgctcactgcgtccggttt 5547
Db 5521 ATTACGCGCGCTCACTGCGTCCGGTTT 5547





RESULT 2 
T73870 

ID T73870 standard; DNA; 5518 BP.

AC T73870; 

DT 26-JAN-1998 (first entry) 

DE Cotton fibre promoter clone 4-4(6) construct, pCGN5610 (Version II). 

KW promoter; fibre-specific; transcriptional factor; promoter; 

KW altered phenotype; colour; melanin; indigo; ss. 

OS Gossypium hirsutum cv. coker 130. 

PN WO9640924-A2.

PD 19-DEC-1996.

PF 07-JUN-1996; U09897. 

PR 07-JUN-1995; US-480178. 

PR 01-JUL-1996; ZA-005572.

PA (CALJ) CALGENE INC.

PI Mcbride K, Pear JR, Perez-Grau L, Stalker DM; 

DR WPI; 97-052325/05. 

PT DNA construct contg. gene of interest controlled by cotton fibre 

PT transcriptional factor - used to produce altered phenotype cotton 

PT fibre cells expressing genes affecting pigmentation 

PS Example 5; Fig 3A-J; 95pp; English.

CC The present sequence is a 4-4 cotton fibre expression cassette (version 

CC II) from promoter construct pCGN5610. The lambda genomic phage clone used 

CC to form this construct was designated 4-4(6). DNA constructs containing 

CC cotton fibre-specific transcriptional factor promoters are useful to 

CC produce cotton fibre cells with altered phenotype, especially altered 

CC colour. Genes involved in the production of melanin (e.g. tyrosinase 

CC gene and ORF438 encoded protein from Streptomyces antibioticus) and 

CC indigo (mono-oxygenase genes possibly in conjunction with a 

CC tryptophanase gene) are of interest. The promoters of the invention are 

CC reliable and permit expression of a protein selectively in cotton fibre 
PF 07-JUN-1996; U09897.

PR 07-JUN-1995; US-480178.

PR 01-JUL-1996; ZA-005572.

PA (CALJ) CALGENE INC.

PI Mcbride K, Pear JR, Perez-Grau L, Stalker DM;

DR WPI; 97-052325/05.

DR P-PSDB; W21899.

PT DNA construct contg. gene of interest controlled by cotton fibre 

PT transcriptional factor - used to produce altered phenotype cotton 

PT fibre cells expressing genes affecting pigmentation 

PS Claim 22; Fig 2A-J; 95pp; English. 

CC The present sequence is a 4-4 cotton fibre expression cassette (version 

CC I) from promoter construct pCGN5606. The lambda genomic phage clone used 

CC to form this construct was designated 4-4(6). DNA constructs containing 

CC cotton fibre-specific transcriptional factor promoters are useful to 

CC produce cotton fibre cells with altered phenotype, especially altered 

CC colour. Genes involved in the production of melanin (e.g. tyrosinase 

CC gene and ORF438 encoded protein from Streptomyces antibioticus) and 

CC indigo (mono-oxygenase genes possibly in conjunction with a 

CC tryptophanase gene) are of interest. The promoters of the invention are 

CC reliable and permit expression of a protein selectively in cotton fibre 

CC to affect qualities such as fibre strength, length, colour and dyeability 

CC as required. The construct and methods can also be used for the 

CC introduction of other advantageous genes into a cotton plant, e.g. a 

CC plant hormone. In particular, fibres from a plant producing coloured 

CC fibres may be used to produce yarns and/or fabrics that do not require 

CC dyeing.

SQ Sequence 5547 BP; 1889 A; 808 C; 822 G; 2028 T; 



Query Match 100.0%; Score 5547; DB 1; Length 5547; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 5547; Conservative 0; Mismatches 0; Indels 0; Gaps


Qy 1 actaaagggaacaaaagctggagctccaccgcggtggcggccgctctagaactagtggat 60
Db 1 ACTAAAGGGAACAAAAGCTGGAGCTCCACCGCGGTGGCGGCCGCTCTAGAACTAGTGGAT 60

Qy 61 cccccgtggactaaacaaaacatgggaagatttgctgtaaaaaaataaaagaagcttact 120
Db 61 CCCCCGTGGACTAAACAAAACATGGGAAGATTTGCTGTAAAAAAATAAAAGAAGCTTACT 120

Qy 121 caataacactttgtgaattgtatacaaaagactcaatgaaaaacaataactcaatacact 180
Db 121 CAATAACACTTTGTGAATTGTATACAAAAGACTCAATGAAAAACAATAACTCAATACACT 180

Qy 181 ttttttcactgatttacatcctttatataggctgaaactacaacaactttagctaaaaaa 240
Db 181 TTTTTTCACTGATTTACATCCTTTATATAGGCTGAAACTACAACAACTTTAGCTAAAAAA 240

Qy 241 ataggataacctaatagcaaaatcacaatcagatattaaaccatgattttagctaaccat 300
Db 241 ATAGGATAACCTAATAGCAAAATCACAATCAGATATTAAACCATGATTTTAGCTAACCAT 300

Qy 301 ttaacaactttattgaaactaatttgaatatttcatctgctgatatgcccaagattttag 360
Db 301 TTAACAACTTTATTGAAACTAATTTGAATATTTCATCTGCTGATATGCCCAAGATTTTAG 360

Qy 361 gccactaaccgatttggtggtgaactttaacatgtcatgcatttgtaactgtttgaaaca 420
Db 361 GCCACTAACCGATTTGGTGGTGAACTTTAACATGTCATGCATTTGTAACTGTTTGAAACA 420

Qy 421 agttttttgcattattttactatatgaactgtttgattaggttgagttacacactgagct 480
Db 421 AGTTTTTTGCATTATTTTACTATATGAACTGTTTGATTAGGTTGAGTTACACACTGAGCT 480

Qy 481 tgtaagctcactcaaatttttctaatttctaaggtgatcagcaaacttaggaccgggcgg 540
Db 481 TGTAAGCTCACTCAAATTTTTCTAATTTCTAAGGTGATCAGCAAACTTAGGACCGGGCGG 540

Qy 541 cgtacgagagctcggattgattttctagttaataaataagacgatttatgtttttaaact 600
Db 541 CGTACGAGAGCTCGGATTGATTTTCTAGTTAATAAATAAGACGATTTATGTTTTTAAACT 600

Qy 601 attatggactttttggactatgtaactgtttgggactttatttttgttttttatttgctt 660
Db 601 ATTATGGACTTTTTGGACTATGTAACTGTTTGGGACTTTATTTTTGTTTTTTATTTGCTT 660

Qy 661 tttttggatttagtaattattatttttaaactgcaaaattatatgtttttacaaactaag 720
Db 661 TTTTTGGATTTAGTAATTATTATTTTTAAACTGCAAAATTATATGTTTTTACAAACTAAG 720

Qy 721 tcacagttttcaaaattccataacttagaatttttcgctgcaaaataaagtaatcattta 780
Db 721 TCACAGTTTTCAAAATTCCATAACTTAGAATTTTTCGCTGCAAAATAAAGTAATCATTTA 780

Qy 781 agtgttttttctgtaataaaataaataaataattttaacgagtattttcctaaaaattgg 840
Db 781 AGTGTTTTTTCTGTAATAAAATAAATAAATAATTTTAACGAGTATTTTCCTAAAAATTGG 840

Qy 841 aaattgatttaccaaaattagtatgtcaaaacacatgtttatatgttacagggcgatatc 900
Db 841 AAATTGATTTACCAAAATTAGTATGTCAAAACACATGTTTATATGTTACAGGGCGATATC 900

Qy 901 gtctaggcaaataacatctaggcggggtttggagtgttacagggcgagtgggctcatttt 960
Db 901 GTCTAGGCAAATAACATCTAGGCGGGGTTTGGAGTGTTACAGGGCGAGTGGGCTCATTTT 960

Qy 961 gagtaagtatagttagggccgagttttagattgcatattcaaggtcaaagattttgtaaa 1020
Db 961 GAGTAAGTATAGTTAGGGCCGAGTTTTAGATTGCATATTCAAGGTCAAAGATTTTGTAAA 1020

Qy 1021 cttcgatgaatgatatgtatgattgtccgattaacgaaatatgtttttttcttttgtgtg 1080
Db 1021 CTTCGATGAATGATATGTATGATTGTCCGATTAACGAAATATGTTTTTTTCTTTTGTGTG 1080

Qy 1081 tgttttatctcgtgtgataagtatatagtatgttttattccaattcttatggcatgtgac 1140
Db 1081 TGTTTTATCTCGTGTGATAAGTATATAGTATGTTTTATTCCAATTCTTATGGCATGTGAC 1140

Qy 1141 attgtggctattctaattaaattgatttgttattattgaaatctgatgcatctgttctac 1200
Db 1141 ATTGTGGCTATTCTAATTAAATTGATTTGTTATTATTGAAATCTGATGCATCTGTTCTAC 1200

Qy 1201 aaagcatggaatctcatgcctactgctttctgttaaagatacgattgcaagtttaacatg 1260
Db 1201 AAAGCATGGAATCTCATGCCTACTGCTTTCTGTTAAAGATACGATTGCAAGTTTAACATG 1260

Qy 1261 cttactattttgattttgtccttgcatgctatgtcacattacatggggttgggatgatat 1320
Db 1261 CTTACTATTTTGATTTTGTCCTTGCATGCTATGTCACATTACATGGGGTTGGGATGATAT 1320

Qy 1321 ggtaaggaggaagttttgacagtttaatgatttgcactatctggtggtttaaccacatat 1380
Db 1321 GGTAAGGAGGAAGTTTTGACAGTTTAATGATTTGCACTATCTGGTGGTTTAACCACATAT 1380

Qy 1381 ttgttatggcatcttgactgcggttatggtggctcgaccgcccatatctgttctggaaat 1440
Db 1381 TTGTTATGGCATCTTGACTGCGGTTATGGTGGCTCGACCGCCCATATCTGTTCTGGAAAT 1440

Qy 1441 ttatctgtgactctggtggcattgtctacaattatttgttggtgtgttttggatggacga 1500
Db 1441 TTATCTGTGACTCTGGTGGCATTGTCTACAATTATTTGTTGGTGTGTTTTGGATGGACGA 1500

Qy 1501 gtcgtggggaactctatttggtgtgttgcggagttgggtaggaaattttcgaaaaaaatt 1560
Db 1501 GTCGTGGGGAACTCTATTTGGTGTGTTGCGGAGTTGGGTAGGAAATTTTCGAAAAAAATT 1560

Qy 1561 tgcattgtgtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtc 1620
Db 1561 TGCATTGTGTTTTTCTGAAAAATATTGCATTAACATAATCATGCATTCTCAATTTTGGTC 1620

Qy 1621 aattgaacgttataaaattctctatgatatcctgatctgtttattacattatatgtgttt 1680
Db 1621 AATTGAACGTTATAAAATTCTCTATGATATCCTGATCTGTTTATTACATTATATGTGTTT 1680
Db 18155 TTTTTAAAAAATTTTTTAAAAAAATTGAAAAATAAATAAATTATATTTCATTATAAAATT 180! 

Qy 2152 cttttttgtaaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagc 221: 

III I llllll III II II I II INI 
Db 18095 TATTTATTAAAAATTTTTTGTTTATTTTTTAAAAAACATGATTTTATTATATAAATATTT 180: 

Qy 2212 tttaagtagtcagtgtaactctcaaaatctggtcataacttctaggctgagtttgctgtg 22' 

Nil II I II I I I II I I I II II III 
Db 18035 TTTA--TAAAAATAATACATTTAAGAAATTTTTAAAAAATTTATATTAAATTATTTAAAT 17 1 

Qy 2272 ctacagtagtaagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatc 23: 

I I Mil I I II I I I II I I I I 
Db 17977 AATTTAATTTTTCTATATATATATATATATTATATAAATATTCAATAATATATAAATTTA 17' 

Qy 2332 tacaacttttcctttttcttcaattaacatatggttgattcaagttccgatctataataa 23! 

ii i 1 1 ii ii inn mi ii n ii n n mm 

Db 17917 TAAATATATAATAATTAATTAAATTATTATATTATTTATATAAATT-AAATTAATAATAA 178! 

Qy 2392 tttattacgatttatcaatttcaattaccttatatcatcctattataaatataagtcagt 24! 

llllll II II II lllllll I I II II I III I 
Db 17858 ATAAATATGAGAATATAAATTTTTATAAATTATATCTACATTTTTAAATTTTAAAATTTT 17' 

Qy 2452 tcaattcagttttcgaaagttcccaaaaattttgaattttattaaatttattccctaaaa 25] 

I I II I II I II II II II III II II II III I 

Db 17798 TTATTTAAATTATTAGATATATAATAATATATTAAATATTTATATATATATAAATATCTA 17' 

Qy 2512 ccgaaatagttatatctttcaaatttaagtttcatttttcaatccgatttcaatttcatc 25] 

I II II I I III II I II III I II 
Db 17738 TTAATTTATAATTAGTATATAGTTTTTTTTTAAAAAAAAAATTATTTTTTTTAAAAAATT l?i 

Qy 2572 cttttataactctctattatctataattacataaatttcaaattaattttgaaatattta 26: 

mi iii i i ii mi i ii mm in n i mi i 

Db 17678 TTTTTTTAAAAATGAAAAATAAATAAAT---TATATTTCATTATAAAATTTATTTATTAA 17' 
Qy 2632 cactttagtccctaagttcaaaactataaattttcactttagaaattaatcatttttcac 26! 

i in i ii ii iii ii iii i i ii mi in 

Db 17621 AAATTTTTTGTTTATTTTTTAAAAAACATGATTTTATTATATAAATA TTTTTT 171 

Qy 2692 atctaagcatcaaatttaaccaaatgacacaaatttcatgattagttagatcaagctttt 27! 

II II I I llllll II I I III II II II III I II III 
Db 17568 ATAAAAATAATACATTTAAGAAATTTTTAAAAAATTTATATTAAATTATTTAAATAATTT 17! 

Qy 2752 gagtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaatttg 28] 

I I III I I III I II I III III Ml I II II I 
Db 17508 AATTTTTCTATATATATATATATATTATAAATATTCAATAATATATAAATTTATAAATAT 17/ 

Qy 2812 aacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttttgt 28' 

III II I I II I lllll I III II I II I 
Db 17448 ATAATAATTAATTAAATTATTAAAAAAAAAAAAAAAAAAAAATAATTTTTTATTATTAAT 17: 

Qy 2872 tgcaaacggtggagagaagagggaaatgaagattgaccatatttttttattatgttttaa 29: 

I I I II llll II II I I II I I llll 
Db 17388 TAMTATTAGTAATAMTAAATTTTATTTAATTATTAGTTATTAAATTTAT 17: 

Qy 2932 catataatattaataatttaatcataattatactttggtgaatgtgacagtggggagata 29! 

II II I I I lllll II I II I III I III II II 
Db 17337 I1ATTATTAATTAAATATTAATAATGAATAGATTTTATTTAATTATTAATTATTAAATTT 17: 

Qy 2992 cgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagagtgat 30! 

llll I I I III II II I I II 

Db 17277 ATTTATTATTAAAATTTAAAAATTATTTTCATTTTAATATATATATATATATATATATAT 17: 

Qy 3052 caaagtttgagctgccttcaatgagccaatttttgcccataatggataaaggcaatttgt 3i: 

III I II II I III lllll II I III I 
Db 17217 AATTTTTATTAATTATTTTAAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTT 17: 

Qy 3112 ttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcac 3i; 

I III llllll II I III llll I I II 
Db 17157 TATTAAGTATAATTTAATAAATAATTTTTTTTTAAAA--AAAAATATTTTTTTAAGTTTT 17: 

Qy 3172 acacacaaaaaaaaactaatgttggttggttgaattttatattacggaatgtaatattat 32: 

I I I II III II II II II lllll II I III 
Db 17099 AATTATATAATAAATTTATGAATAGGGGGAATAAATTTATTTTCA TTTTACATAT 17( 



Qy 3232 attttaaaataaaattatgttatttagattcttaatattttggagcattccatactataa 3291 

II I I III I III I lllll llllll III II 

Db 17044 ATATATATATATATATATATACAATTAATTAATTCAGATTTAGTGATTAAAATAAATTAT 16985 

Qy 3292 tttcgtaacataatattaaaatatagtaatataaagtgtaattaactttaaattacaagc 3351 

III II I llll I II I II llll III II I II II I I 

Db 16984 TTTATTATACTTATATAATTTAATTGAAAATTAAATTATATGTATATATATATAAATATA 16925 

Qy 3352 ataatattaaattttgaatcaattaatttttatttctattattttaattaatttagtcta 3411 

m i iii n iii i m mi mm mi i n i 1 

Db 16924 TGAATTGAATTTTTATAAAAAATCATTTTAAATTTTTATTATATTAAAAATATTTTTATT 16865 

Qy 3412 ttttttcaaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaac 3471 

II I III II llll I llll II II I I II II 

Db 16864 AATTATTTAAAATAATTTTATTTATAAAATAATTTATTAT AAAAAT 16819 

Qy 3472 tcatgttatacttcaaaattataagtattatatttaccttgatgatttatttattagtat 3531 

I I I I II I llll llllll II lllll III I 
Db 16818 AGTTTATTAAGTATAATTTAATAAATCATTTTTTTTTAAAAAAAAAATATTTTTTAAGTT 16759 

Qy 3532 attaattctgattataattatggtgggatacaatcgctttccactaaatattttaactat 3591 

III I II I lllll I III III III lllll III 

Db 16758 TTAATTATACAATAAATTTATGAATAGGGGGAATAAATTTATTTTCATTTTTTTATATAT 16699 

Qy 3592 -gatttataaatttatttcaacatcgtatatttacttattaatacataatttatcataat 3650 

II llll II II II I II I lllll I lllll I I I I I I II II 
Db 16698 ATATATATATATATAATTAATTATTTCAGATTTAGTGATTAAAATAAATTATTTTATTAT 16639 

Qy 3651 tttatggaaattgagaccaagaaacattaagagaacaaattctataacaaagacaattta 3710 

II llll I lllll lllll llll I II 1.1. 

Db 16638 ATTTATATAATTTA ATTGAAAATTAAAATTATATATATATATATATATA 16590 

Qy 3711 gaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatc 3770 

III I I I II III I I I I I I I III I II 

Db 16589 TAAATATAAATTGAATTTTTTAAAAATTATTTTTAATTTTTATTATAATAAAAATATTTC 16530 

Qy 3771 aaatgaactaaataagataatataacatacggaacatcttacttgtaatcttacattccc 3830 

II II II II lllll I I II II I III I II III' 
Db 16529 TTATTAATTATTTTAAATAATTTTATTTATAAAATAATTTATTATAAA---AATAGTTTA 16473 

Qy 3831 ataattttattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattatt 3890 

III I II I I I lllllll lllll II I I II I I I III 
Db 16472 TTAAGTATAATTTAATAAATAATTTTTTTTTTAAAAAAAATATTTTTTTAAGTTTTAATT 16413 

Qy 3891 atctaaataaagaaaaacacttaatttttataacattttttcatatatttgaaagattat 3950 

II III III I III II I III I llll I I II 
Db 16412 ATATAA TAAATTTATGAATAGGGGGAATAAATTTATTTTCATTTTACATATATA 16359 

Qy 3951 attttgtatatttacgtaaaaatatttgacatagattgagcaccttcttaacataatccc 4010 

I I lllll II II II II II I lllll III Ml II 
Db 16358 TATATATATATATATATACAATTAATTAATTCAGATTTAGTGATTAAAATAAATTATTTT 16299 

Qy 4011 accataagtcaagtatgtagatgagaaattggtacaaacaacgtggggccaaatcccacc 4070 

I III I I II II III II I I I II llll 

Db 16298 ATTATACTTATATAATTTAATTGAAAATTAAATTATATGTATATATATATAAATATATGA 16239 

Qy 4071 aaaccatctctcattctctcctataaaaggcttgctacacatagacaacaatccacacac 4130 

lllll lllllll lllllll 

Db 16238 ATTGAATTTTTATAAAAAATCATTTTAAATTTTTATTATATTAAAAATATTTTTATTAAT 16179 

Qy 4131 aaatacacgttcttttctttctatttgattaaccatggctcatagca---ttcgtcaccc 4187 

llll llll III II I I llll I II 

Db 16178 TATTTAAAATAATTTTATTTATAAAATAATTTATTATAAAAATAGTTTATTAAGTATAAT 16119 

Qy 4188 tttcttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgt 4247 

II I I I llll I III II III II II I 

Db 16118 TTAATAAATCATTTTTTTTTAAAAAAAAAATATTTTTTAAGTTTTAATTATACAATAAAT 16059 

Qy 4248 ttcggca----gcggctcgacgtttattcgagacacaagcaacctcatcagagctcccac 4303 

III I II I ! lllll III llll 

Db 16058 TTATGAATAGGGGGAATAAATTTATTTTCATTTTTTTATATATATATATATATATATAAT 15999 
Db 9076 ATATTTTTATTTATTTAAAATATTATATTAAATAATAAAGACAATATATTAAACATATAA 9017 

Qy 3147 aattaaaataaggtggcctggtcacacacacaaaaaaaaactaatgttggttggttgaat 3206 

I I M I I I II II II I II II II III 

Db 9016 ATTAATATAATATTTTATTTTATTTATTAATTTGTTTAATATATTATTATTTTATTTAAT 8957 
Qy 3207 tttatattacggaatgtaatattatattttaaaataaaattatgttatttagattcttaa 3266 

I I mi i ii mi inn i n mm urn i m 

Db 8956 TATTTATTTAATATATTATTATTTTATTTAATTATTTAATTATATTATTATTTTATTTAT 8897 
Qy 3267 tattttggagcattccatactataatttcgtaacataatattaaaatatagtaatataaa 3326 

ii m i ii m ii m i i mm iii 1 1 ii iii 

Db 8896 TAATTTATATATTTTAATATATTATTTTATTTATTTAATATAATATTTAATTTATTTAA- 8836 
Qy 3327 gtgtaattaactttaaattacaagcataatattaaattttgaatcaattaatttttattt 3386 

i mi i iii nm ii in i in inn i n 

Db 8837 -TATAATAATATTTTAATTATTTAATATAATATTTCATTTAATTCATTTAATATAATATT 8779 

Qy 3387 ctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaataaaaat 3446 

II I I I lllllllll III I II I I I III I I 

Db 8778 TTAATTATATTATTAATTTATATATTTTAACATTTTATTTAATTTATTTAATATATATAA 8719 

Qy 3447 aatttttccttaatgttgaaacaactcatgttatacttcaaaattataagtattatattt 3506 

lllll I II I I I Mil II II III II II I Mill 
Db 8718 TATTTTATTTATTAATTATATTATTAATTTATATATTTTAATATTTTATTTAATTTATTT 8659 

Qy 3507 accttgatgatttatttattagtatattaattctgattataattatggtgggatacaatc 3566 

I I II llllll llllllll I I lllll I III II 

Db 8658 AATATAATATTTTATT TATTAATTATATTATTAATTTATTTAATATAATATT 8607 

Qy 3567 gctttccactaaatattttaactatgatttataaatttatttcaacatcgtatatttact 3626 

II III III I I III I II I II I I II lllll I 
Db 8606 TTATTTATTTAATTATATATATTATTAATTTATATATTTTAATATTTTATTTAATTTATT 8547 

Qy 3627 tattaatacataatttatcataattttatggaaattgagaccaagaaacattaagagaac 3686 

II II III III lllll III I I II I I I 

Db 8546 TAATATAATATTTTATTTATTAATTATATTATTAATTTATTTAATATAATAGTTTATTTA 8487 

Qy 3687 aaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagtactct 3746 

mi i i i ii ii i iiiii mm i 

Db 8486 TTTAATTATATATATTATTAATTTATATATTTTAATATTTTATTTAATTTATTTAATATA 8427 

Qy 3747 taaccaaacacaaaaattcaaatcaaatgaactaaataagataatataacatacggaaca 3806 

I I I Mil III II II I II I Ml I 
Db 8426 ATATTTTATTTATTAATTATATTATTAATTTATATATTTTAATATTTTATTTAATTTATT 8367 

Qy 3807 tcttacttgtaatcttacattcccataattttattatgaaaaataatcttatattactcg 3866 

i ii i mi ii iii iiiii mm n n n n n i 

Db 8366 TAATATATATAATATTTTATT-TATTAATTATATTATTAATTTATATATTTTAATATTTT 8308 

Qy 3867 aactaaatgttgtcacaaattattatctaaataaagaaaaacacttaatttttataacat 3926 

I III I lllll llll II II II II III llllll I 

Db 8307 ATTTAATTTATTTAATATAATATTTTATTTATTAATTATATTATTAA---TTTATATATT 8251 

Qy 3927 tttttcatatatttgaaagattatattttgtatatttacgtaaaaatatttgacatagat 3986 

II I I lllll I llllllll I II I II I MINI I I I 

Db 8250 TTAATATITTATTTAATTAATTATATTATTAATTTATATTTTTTAATAmTATTTTATT 8191 

Qy 3987 tgagcaccttcttaacataat 4007 

II II II III I 
Db 8190 TTATTTATTTAATATAATATT 8170 
LOCUS DMU37541 19517 bp DNA circular INV 04-APR-2000
DEFINITION Drosophila melanogaster complete mitochondrial genome.
ACCESSION U37541
VERSION U37541.1 GI:1166529
SOURCE Drosophila melanogaster.
ORGANISM Mitochondrion Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE 1 (bases 12511 to 12682)
AUTHORS Clary,D.O., Goddard,J.M., Martin,S.C., Fauron,C.M. and
Wolstenholme,D.R.
TITLE Drosophila mitochondrial DNA: a novel gene order
JOURNAL Nucleic Acids Res. 10 (21), 6619-6637 (1982)
MEDLINE 83090428
REFERENCE 2 (bases 5269 to 5695)
AUTHORS Clary,D.O., Wahleithner,J.A. and Wolstenholme,D.R.
TITLE Transfer RNA genes in Drosophila mitochondrial DNA: related 5'
flanking sequences and comparisons to mammalian mitochondrial tRNA
genes
JOURNAL Nucleic Acids Res. 11 (8), 2411-2425 (1983)
MEDLINE 83220794
REFERENCE 3 (bases 404 to 5272)
AUTHORS de Bruijn,M.H.
TITLE Drosophila melanogaster mitochondrial DNA, a novel organization and
genetic code
JOURNAL Nature 304 (5923), 234-241 (1983)
MEDLINE 83245048
REFERENCE 4 (bases 804 to 1778)
AUTHORS Satta,Y., Ishiwa,H. and Chigusa,S.I.
TITLE Analysis of nucleotide substitutions of mitochondrial DNAs in
Drosophila melanogaster and its sibling species
JOURNAL Mol. Biol. Evol. 4 (6), 638-650 (1987)
MEDLINE 88174373
REFERENCE 5 (bases 5268 to 13619)
AUTHORS Garesse,R.
TITLE Drosophila melanogaster mitochondrial DNA: gene organization and
evolutionary considerations
JOURNAL Genetics 118 (4), 649-663 (1988)
MEDLINE 88212147
REFERENCE 6 (bases 441 to 2967)
AUTHORS Satta,Y. and Takahata,N.
TITLE Evolution of Drosophila mitochondrial DNA and the history of the
melanogaster subgroup
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 87 (24), 9558-9562 (1990)
MEDLINE 91088557
REFERENCE 7 (bases 14215 to 14512)
AUTHORS Ballard,J.W., Olsen,G.J., Faith,D.P., Odgers,W.A., Rowell,D.M. and
Atkinson,P.W.
TITLE Evidence from 12S ribosomal RNA sequences that onychophorans are
modified arthropods
JOURNAL Science 258 (5086), 1345-1348 (1992)
REFERENCE 8 (bases 14917 to 19517)
AUTHORS Lewis,D.L., Farr,C.L., Farquhar,A.L. and Kaguni,L.S.
TITLE Sequence, organization, and evolution of the A+T region of
Drosophila melanogaster mitochondrial DNA
JOURNAL Mol. Biol. Evol. 11 (3), 523-538 (1994)
MEDLINE 94285822
REFERENCE 9 (bases 1 to 408; 13319 to 19517)
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S.
TITLE Drosophila melanogaster mitochondrial DNA: completion of the
nucleotide sequence and evolutionary comparisons
JOURNAL Insect Mol. Biol. 4 (4), 263-278 (1995)
MEDLINE 96423163
REFERENCE 10 (bases 1 to 19517)
AUTHORS Lewis,D.L., Farr,C.L. and Kaguni,L.S.
TITLE Direct Submission
JOURNAL Submitted (03-OCT-1995) Laurie S. Kaguni, Biochemistry Department,
Michigan State University, East Lansing, MI 48824-1319, USA
FEATURES Location/Qualifiers
source 1..19517
/organism="Drosophila melanogaster"
/organelle="mitochondrion"
/db_xref="taxon:7227"
/note="derived from new and previously submitted
sequences; sequence is a composite containing sequences
obtained from different Drosophila melanogaster strains"
1..65
/gene="mt:ND6"
AL031746.9 GI:6594243
KEYWORDS HTG.
SOURCE malaria parasite P. falciparum.
ORGANISM Plasmodium falciparum
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium.
REFERENCE 1 (bases 1 to 67970)
AUTHORS Bowman,S., Churcher,C., Harris,B., Harris,D., Lawson,D., Quail,M.
and Barrell,B.
TITLE Direct Submission
JOURNAL Submitted (24-SEP-1998) P. falciparum Genome Sequencing Consortium,
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge
CB10 1SA, UK
COMMENT On Dec 16, 1999 this sequence version replaced gi:5763807.
For more information about this sequence or the Malaria Project,
see http://www.sanger.ac.uk/Projects/P_falciparum. IMPORTANT: This
sequence is unfinished and does not necessarily represent the
correct sequence. Work on the sequence is in progress and the
release of this data is based on the understanding that the
sequence may change as work continues. The sequence may be
contaminated with foreign sequence from E.coli, yeast, vector,
phage etc.
FEATURES Location/Qualifiers
source 1..67970
/organism="Plasmodium falciparum"
/strain="3D7"
/db_xref="taxon:5833"
/chromosome="1"
gene complement(1748..3276)
/gene="MAL1P3.01"
CDS complement(join(1748..2598,2748..2848,2990..3276))
/gene="MAL1P3.01"
/note="MAL1P3.01, conserved hypothetical protein, len: 412
aa, similarity: UPF0006 family eg to
YBL055C/YBL0512/YBL0511, YBF5_YEAST (418 aa), fasta
scores: opt: 316, E(): 1.1e-12, (33.2% identity in 271 aa
overlap)"
/codon_start=1
/product="conserved hypothetical protein, UPF0006 family"
/protein_id="CAB63556.1"
/db_xref="GI:6594244"
/translation="MKLVFHYIKYINVLFYISIIFLKSNSLKIYNDLRYISTVNKYKV
LQIKKRSNLKKNHNIRKMEDNESSFIDIGSNLTDKMFDGVYNSKKHENDLQNVLNRAK 
NNNVDKI 1 1 TCTCLAEIDKSLK ICET YDPEG KFLYLSAGVHPTNCYEF IDKNKHEEKE 
IIAKKEYEEFIKYFKNEQVENSKMENGNKKICDGEKDMNNLNEILLEKNLDTIPGFKY 
NERDREYLENLKNRIIKYPNWCIGEIGLDFDRLYFCSKYIQIKYFIFQLKLVQMFN 
LPMFLHMNCSETFFRIVDIYKFLFERNGGVIHSFTDKEDIVBIIVQNYRNLYIGVNG 
CSLKSLENINAVKKIPLNLLLLETDAPWCGVKRTHASYEYIRDTYERRAYTNLRKIRN 
I IKCDDNT I FKERNEPYNIA " 
misc_feature complement(2599..2610)
/gene="MAL1P3.01"
/note="potential splice acceptor sequence"
misc_feature complement(2742..2747)
/gene="MAL1P3.01"
/note="potential splice donor sequence, atg/gttaaa"
misc_feature complement(2849..2861)
/gene="MAL1P3.01"
/note="potential splice acceptor sequence"
misc_feature complement(2984..2989)
/gene="MAL1P3.01"
/note="potential splice donor sequence, aaa/gtaaaa"
gene 5005..5496
/gene="MAL1P3.02"
CDS 5005..5496
/gene="MAL1P3.02"
/note="MAL1P3.02, hypothetical protein, len: 163 aa,
contains possible signal sequence"
/codon_start=1
/product="hypothetical protein, MAL1P3.02"
/protein_id="CAB63557.1"
/db_xref="GI:6594245"
/translation="MKLLNNEFWLCPIIILFFFLNSWLGNNNRNNINFHETENAAK
AMRKLLSGEINSIKLDNGDELKIKLNDEKHKDSTKWDKSYSFISNLEEEKYSQTDLFR 
RRQEINEANTKIIEDRQEFYILNNDEIENIATRFVLENNFDELYIQSFKQSLIDIIQS 



misc_feature 8020..10389
/note="possible cenl, region of very high [A+T] content"
gene 14884..20352
/gene="MAL1P3.03"
CDS 14884..20352
/gene="MAL1P3.03"
/note="MAL1P3.03, putative ABC transporter, len: 1822 aa"
/codon_start=1
/product="putative ABC transporter"
/protein_id="CAB63558.1"
/db_xref="GI:6594246"
/translation="MTTYKENVGISNKGNKRKKSCQNISFLNFLSFDWIRPLINDLIK
GDIQELPNICRNFDVPYYASRLEENLRDIEVEDSEFYSEKNSSNEHVLHHCNSNDASE 
KRVYNVYYHNILWSILKTFKFRIILIISFYILETLIVTLGGKFIDYYMRILEGQKIPV 
YISFLRDFKVFSGLVWMIMFFHLFFEALLHFYFHLFTINLKVSLMYFLYRINLCSNN 
NHLQNPDAFYNTYRKFSSQTEIDEISRDFLSIGKNASSSSSGIRNNNKNIDNNKFVEN 
DYIINFIRSTRRMEKDSLNENRSLPNVNIYNIMFSDVPSVTFFVTSCINLFNVFVRIF 
MSFYVFHIKIGSNSVGIAIWLSIALYSAMILFEFLPSLFKSKYLIYRDRRIDNMHHVL 
KEFRLIRMFNWESFAFKYINIFRMREMKYCKIRLYLSNIGVFISSISSDIVEWIFFI 
YLKDRLNRKEEIKFTSIIMPLYVYRILISNVANFPNLVNNVMEGIVNIRRLNNYINDH 
LYYNDIRNYFMYRTRYNEDYNIWDKTFLQNENITSHDDGTSHNLRHLKNVIRNKLTN 
MFKYFFFYHRMNYHRNIINKQILSGLLKNVDDNTNKRICFQEHRSNSTYNYNSSHIHE 
KKEEYENIHNSSNSTMSNEFREKRKNNEYIIRLENCSFGLSYDNKCDNDHILKNINFN 
LRRNSLAIIIGPGSGKSAFFHSILGDFNMTHGNLYIENFFKRMPILYVPONSWLFMG 
NIRSMILFGNEYNPLIYKYTIIQSELLNDLSTIEHGDMKYINDDHNLSKGQRVRICLA 
RALYEHYIHMHKLCTDYEKRLIQPNEILDRDLINNRNISSYNNKKSRLVNYNIPFNEN 
YLQRCLMDDNNFYLYLLDDIFTSLDPS ISKKIFSNLFCKEDNISFKDNCSFI ISMNKS 
TLDNFLIEDILDNVQYEVNIFEIODKTLKYRGNISEYMEKNNLNITKESHWGYSNLNT 
IDYTRIKLFDEVELNHVRHSNRMIYKEAYFVKGNTESVSFEIDSINREYIRKMRKRNY- 



gene 



misc.feature 



YTLDTYTSNNSDREEIVRPLYKDTHEEFNKSSSMPFVKSSSNMINNPSNFRYEDNSSS 
FKGSISLETYLWYFQQVGFVLLTSWIFMLISIFTDEIKFVFLTMMSIISRNNREHSD 
TILQKOVRYLEYFVILPIISLVTSGICFSMIIYGNITSAIRVHNNILYSILNAPLYIF 
YNNNLGNIINRFIIDISAFDYGFLKRIYKAFFIFFRCILSSLLIIYMIRDCIFIFPFV 
IILIYFFVFKRFSRGCREAQRLYLSCHTPLCNIYSNALSGKNIINIYKKNTYHLDVYE 
HYINNFRISYFFRWLINIWASLYIKIFILLLTTYIIMHPHLYASGIIKLYKEKNYVRI 
LSTLGYCISFSARLGVIIKFLLCDYTHIEKEMCCVQRLEEFARISNKENASMNKENEL 



KRIPLVNGTYKYIDEEPSLRNINMYALRNQKIGIVGKSGAGRSTILLSILGLINISQG 

RITVEGRDIRTYNRKGEDSIIGILAQSSFVFYNWNIRTFIDPYNNFTDDEIVHALKLN 

GINLGRNDLYKYMHKQDMRSNYRRIIOTSRVINQSNDNTILLTNDCIRYLSLVRLYLN 

RHRYKIILIDEIPIFNLNNSVHDELNSFLIGKAKSFNYIIRNHFPNNTVLIISHHANT 

LSCCDYIYVLRKGEITYRCSYEDVKTQSELSHLLEMDD" 

23896..31533
/gene="rRNA"
/note="region containing small subunit, 5.8S and large
subunit rRNA genes and spacer regions"
23896..31533
/gene="rRNA"
complement(31966..32775)
/gene="MAL1P3.04"
complement(join(31966..32476,32675..32775))
/gene="MAL1P3.04"
/note="MAL1P3.04, conserved hypothetical membrane protein,
len: 203 aa, similarity: P. falciparum chromosome 2,
PFB0110W, O96126 predicted integral membrane protein (255
aa), fasta scores: opt: 335, E(): 4.9e-15, (36.1% identity
in 191 aa overlap)"
/codon_start=1
/product="conserved hypothetical membrane protein,
MAL1P3.04"
/protein_id="CAB63559.1"
/db_xref="GI:6594247"
/translation="MKKSYTFINVTILLFLTLLLFLTYYNYDTFSKTKFNNNIRIDIN
RFKRIIAEASEEQKYPWEEDFCLILNEEELIRPEHNDSPYLPEHYENIDKINELSINS 
TKIWKETIKKMRQNYEKETDNMNHNWRDFMWHYRWANIYLYRVHRLINITLRDLTNPI 
HDKEETITTWIKWIQEDIEYFLFNLQVEWLRILTLELFYKNKE" 

complement(32477..32486)
/gene="MAL1P3.04"
/note="potential splice acceptor sequence"
complement(32669..32674)
/gene="MAL1P3.04"
i ii i iiiii i i mm i i ii i ii mi

Db 74140 ATATTATTAATTAAATTAAAATAATAATTTATTTTATAATTATATATATATATTAATAAT 74081 



Qy 2770 aattacaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaagcttggccg 2829

I I llll I III II I I IIIII III II II I 
Db 74080 TAATTTAAAATATAAATTAAGAAAGATAATTTTATACTTTTATTTAATTAAATATATAGT 74021

Qy 2830 aatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacggtggagagaa 2889
IIIII II II I I Ml II II I II II 
Db 74020 AATAAATAAITTIATGTTATTTATTATAATAATATTTATTATTTTATTTTATTTAITTAA 73961

Qy 2890 gagggaaatgaagattgaccatatttttttattatgttttaacat----ataatattaat 2945

I llll II llll III I II! II II I! Ill IIIII! 
Db 73960 TAAATAATTAATTTTATAAAATATATTTATTTTAAATTAAAATATAAACAIATAATTAAT 73901 

Qy 2946 aatttaatcataattatactttggtgaatgtgacagtggggagatacgtaaagtatttta 3005

I llll III llll III I III I I I III II III 

Db 73900 TAATTAAATATATATATATTTTTTTTAATATAATAAATAATATATTTCACATTTTAATTA 73841

Qy 3006 acattata--------ctttttgcaagcagttggctggtctacccaagagtgatcaaagt 3057

MM Mi II Ml I I I II II I 

Db 73840 AAATAAAAATAACCATTTATTAATTAACTTAATTAATATATAAAATAAAATAATTTAATT 73781 

Qy 3058 ttgagctgccttcaatgagccaatttttgcccataatggataaaggcaatttgtttagtt 3117

II I I II I II III IIIII II I I I I II I 
Db 73780 GTGTAAITAAATTAAATATAAAACATTTATTAATAATTAATTATATATAATATTATATAT 73721

Qy 3118 caactgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacac 3177

III I I I II III I llll llll I I I I 
Db 73720 TATCTTAAATTAATTAATTTTTTTAATTATTTTAATATAATAATTATTTATTAATATTAA 73661

Qy 3178 aaaaaaaaactaatgttggttggttgaattttatattacggaatgtaatattatatttta 3237

llll II III II I II IIIII I II Mil I 
Db 73660 TTATGTATATTTTATTTAATTGTTTAATATT--TATTATTATTTTATTTAATATATTATT 73603

Qy 3238 aaataaaattatgttatttagattcttaatattttggagcattccatactataatttcgt 3297

II I II I II I I II IIIII II II I III III I 
Db 73602 AATTTAATTAATTATTATATTATATTTAATTATTATATTATTIATAATATATTATTAATT 73543

Qy 3298 aacataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataata 3357

I II I II I III II llll I I I III I II III I II II! 
Db 73542 TAATTATTGTTTATTTATTATATTATATATTATTATTTAATTATTATTTTATTTATTATA 73483 

Qy 3358 ttaaattttgaatcaattaatttttatttctattattttaattaatttagtctatttttt 3417

III II II I I Ml I I II II III II II I llll II 
Db 73482 TTATAT-----ATTATTAATTTATATAATATAATTTAATTATATATATATTTAATTTATT 73428

Qy 3418 caaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgt 3477

I I lllllll I II II I Mill II II! I I 
Db 73427 TATATATTAATTTAATTATATATITATTTAATTTAATTATATATTTTATTTATTAATTTA 73368

Qy 3478 tatacttcaaaattataagtattatatttaccttgatgatttatttattagtatattaat 3537

I I II I II Ml Mil I I IIIII llll MM I 
Db 73367 TTTTATTTATTAATTTAATTTAATTATATATATATTTAATTTAATTATATATATATTTAA 73308 

Qy 3538 tctgattataattatggtgggatacaatcgctttccactaaatattttaactatgattta 3597
I I MM llll I I I I I I I III I II I II 

Db 73307 TTTAATTATATATATATTTAATTTAATTGTATATATATTTAATTTATTTATATATTTATT 73248



Qy 3598 taaatttatttcaacatcgtatatttacttattaatacataatttatcataattttatgg 3657 

III II II I I III II IIIII I! I I I! I I III! 

Db 73247 TAATTTATTTATATATTTAATTATATATTTATTTTTATTTTTTATAACTATAAATTATTA 73188 



Qy 3658 aaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaaa 3717

I I I II llll Ml I II II II I 
Db 73187 ATTTAATTATTAATTTTATTTTATTTACTAAATATATAATTAATTTATATATATTATTGT 73128

Qy 3718 atgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaa 3777

I I Ml I I! I I I I I I I I I I I III 
Db 73127 TTTAATTATTTAAATAATTCATTTTATTTAATTAATTTATATATTATTATTAATAATTCT 73068

Qy 3778 ctaaataagataatataacatacggaacatcttacttgtaatcttacattcccataattt 3837
III LI I I III I I II II I III III I I 



Db 73067 TTAATTTATAATAATTAATTGTTTAATATATATTATTATATTTAATTATTTAAATATTGT 73008 

Qy 3838 tattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaaa 3897 

II I II I II III I I II II! I I III III III 
Db 73007 TAATTAATTAATTTATTATATTATTATTTAATTAATTTATTATATTATTATTTAATTAAT 72948 

Qy 3898 taaagaaaaacacttaatttttataacattttttcatatatttgaaagattatattttgt 3957 

I I II III I Ml II II IIIII I llll! 
Db 72947 TTATATTAATATITTATTATTTAATTTTAATTAAAATTTATTTATTATTATTTTATTTTA 72888 

Qy 3958 atatttacgtaaaaatattt 3977 

I II I IIIII 
Db 72887 TATTAATATTAGTATTATTT 72868 
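The end coordinate printed at the right of each Qy or Db row follows from the start coordinate and the number of residues on the row, since gap characters do not consume a position. The sketch below is illustrative only; the function name end_coordinate and the reverse flag (for database rows whose coordinates run downward, as in this alignment) are assumptions, not part of the search program's output.

```python
def end_coordinate(start: int, aligned: str, reverse: bool = False) -> int:
    """Return the coordinate of the last residue on one alignment row.

    Gap characters ('-') are skipped because they do not advance the
    position; reverse=True is for rows numbered in descending order.
    """
    residues = sum(1 for ch in aligned if ch != "-")
    if residues == 0:
        raise ValueError("row contains no residues")
    step = -1 if reverse else 1
    return start + step * (residues - 1)


# Final rows of the alignment above
print(end_coordinate(3958, "atatttacgtaaaaatattt"))                 # 3977
print(end_coordinate(72887, "TATTAATATTAGTATTATTT", reverse=True))  # 72868
```

The same relation can be used to restore an end coordinate that was lost in scanning, provided the sequence on the row is intact.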
AC004157
LOCUS AC004157 130281 bp DNA HTG 15-MAR-2000
DEFINITION Plasmodium falciparum chromosome 12 clone 3D7, *** SEQUENCING IN
PROGRESS ***, 3 unordered pieces.
ACCESSION AC004157
VERSION AC004157.6 GI:7243830
KEYWORDS HTG; HTGS_PHASE1.
SOURCE malaria parasite P. falciparum.
ORGANISM Plasmodium falciparum
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium.
REFERENCE 1 (bases 1 to 130281)
AUTHORS Hyman,R.W., Fung,E.L., Qin,F., Rowley,D., Tamaki,T., Kurdi,O.B.,
Conway,A.B. and Davis,R.W.
TITLE Plasmodium falciparum 3D7 chromosome 12
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 130281)
AUTHORS Hyman,R.W., Qin,F., Fung,E.L., Conway,A.B. and Davis,R.W.
TITLE Direct Submission
JOURNAL Submitted (19-FEB-1998) Stanford DNA Sequencing and Technology
Center, Stanford University, 855 California Avenue, Palo Alto, CA
94304, USA
COMMENT On Mar 15, 2000 this sequence version replaced gi:6652498.
* NOTE: This is a 'working draft' sequence. It currently
* consists of 3 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 67262: contig of 67262 bp in length
* 67263 67462: gap of unknown length
* 67463 82485: contig of 15023 bp in length
* 82486 82685: gap of unknown length
* 82686 130281: contig of 47596 bp in length.
FEATURES Location/Qualifiers
source 1..130281
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="12"
/clone="3D7"
BASE COUNT 52250 a 11780 c 11855 g 53996 t 400 others
ORIGIN
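The contig and gap rows of a working-draft record are internally redundant: each row should start one base after the previous row ends, each stated contig length should equal end - start + 1, and the last row should end at the record length. A minimal sketch of that check follows; the regular expression and the function name check_layout are illustrative assumptions about the row wording, not a GenBank parser.

```python
import re

ROW = re.compile(
    r"(\d+)\s+(\d+):\s+(?:contig of (\d+) bp in length|gap of unknown length)"
)


def check_layout(rows, total_length):
    """Return True if the rows tile 1..total_length with consistent lengths."""
    expected_start = 1
    for row in rows:
        match = ROW.search(row)
        if match is None:
            raise ValueError(f"unrecognised row: {row!r}")
        start, end = int(match.group(1)), int(match.group(2))
        if start != expected_start:
            return False
        stated = match.group(3)  # None for gap rows
        if stated is not None and end - start + 1 != int(stated):
            return False
        expected_start = end + 1
    return expected_start - 1 == total_length


rows = [
    "1 67262: contig of 67262 bp in length",
    "67263 67462: gap of unknown length",
    "67463 82485: contig of 15023 bp in length",
    "82486 82685: gap of unknown length",
    "82686 130281: contig of 47596 bp in length.",
]
print(check_layout(rows, 130281))  # True
```

The base counts give the same total: 52250 + 11780 + 11855 + 53996 + 400 = 130281.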



Query Match 3.4%; Score 189.6; DB 60; Length 130281;

Best Local Similarity 44.9%; Pred. No. 8e-14;

Matches 979; Conservative 0; Mismatches 1179; Indels 22; Gaps 6;
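The "Best Local Similarity" percentage can be recomputed from the counts on the "Matches" line. The sketch below assumes the denominator is the total number of alignment columns (matches + mismatches + indels); that definition is an assumption, chosen because it reproduces the figures reported in this listing, and the function name is illustrative.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percentage of alignment columns that are identical matches."""
    columns = matches + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


# Counts from the summary above: 979 of 2180 columns match
print(round(best_local_similarity(979, 1179, 22), 1))  # 44.9
```

Applied to a summary with 1063 matches, 1309 mismatches and 28 indels, the same formula gives 44.3.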

Qy 1810 aaataattattaattaaaatttatggacttttggactgtctgactaattttcagaatttt 1869 

MM I Mil II I II I III I Ml I I INI 
Db 98635 AAATAAATCTTAATAAATAATTTTTTTTGATAGATTTTCTAGGATAATATGAAATATTTC 98694 

Qy 1870 attttggttttgggttttgttgaattttttagataattattttaaatattctgcataatt 1929 

I I I I Ml III III! III! Ill II I I 
Db 98695 GTAAAAAAAATAATAATAAAAATGATATTTAAATATTTATATTAACTA--CAATATTAAT 98752 
YSSEVN"
CDS 4736..5524
/gene="mt:ND6"
/codon_start=1
/db_xref="FlyBase:FBgn0013685"
/transl_table=5
/product="cytochrome c oxidase subunit III"
/protein_id="AAC47816.1"
/db_xref="GI:1166535"
/translation="MSTHSNHPFHLVDYSPWPLTGAIGAMTTVSGMVIWFHQYDISLF
VLGNI ITILTVYQWWRDVSREGTYQGLHTYAVT IGLRWGMILFILSEVLFFVSFFWAF 
FHSSLSPAIELGASWPPMGIISFNPFOIPLLNTAILLASGVTVTWAHHSLMENNHSQT 
TQGLFFTVLLGiyFTILQAYEYIEAPFTIADSIYGSTFFMATGFHGIHVLIGTTFLLV 
CLLRHLNNHFSKNHHFGFEAAAWYWHFVDVVWLFLYITIYWWGG " 
tRNA 5543..5607
/gene="mt:ND6"

Query Match 3.4%; Score 189.6; DB 58; Length 19517; 

Best Local Similarity 44.3%; Pred. No. 1.3e-13;

Matches 1063; Conservative 0; Mismatches 1309; Indels 28; Gaps 6; 

Qy 1546 ttttcgaaaaaaatttgcattgtgtttttctgaaaaatattgcattaacataatcatgca 1605 

Mil MINI! I II I || I llll III Mill I I 
Db 17119 TTTTTTAAAAAAAAATTATTTATTAAATTATACTTAATAAACTATTTTTATAATAAATTA 17178 

Qy 1606 ttctcaattttggtcaattgaacgttataaaattctctatgatatcctgatctgtttatt 1665 

II I I I III II I II II I Ml I II I I III 
Db 17179 TTTTATAAATAAAATTATTTAAAATAATTAATAAAAATTATATATATATATATATATATA 17238 

Qy 1666 acattatatg-tgtttatgcttgagttaagtcaaacattgagattcatagctcacccaat 1724 

I I III I II II I II II I I I II III I I 
Db 17239 TATTAAAATGAAAATAATTTTTAAATTTTAATAATAAATAAATTTAATAATTAATAATTA 17298 

Qy 1725 tatttaatcatttcaggcaatctgcagacttaggattggatggcgttcaggagcttggat 1784 

II llll llll I I III I Ml II I I II 

Db 17299 AATAAAATCTATTCATTATTAATATTTAATTAATAATAAATAAATTTAATAACTAATAAT 17358 

Qy 1785 tggttttctcacatcatattttattaaataattattaattaaaatttatggacttttgga 1844 

ii ii i ii ii linn iiii imi iiii iiii 

Db 17359 TAAATAAAATTTATTTATTACTAATATTTAATTAATAATAAAAAATTATTTTTTTTTTTT 17418 

Qy 1845 ctgtctg-actaattttcagaattttattttggttttgggttttgttgaattttttaga 1902 

III I llllll llll I II III III I II 

Db 17419 TTTTTTTTTAATAATTTAATTAATTATTATATATTTATAAATTTATATATTATTGAATAT 17478 

Qy 1903 taattattttaaatattctgcataatttttctgttatttgaaaaggatgttcgaattttt 1962 

I II II I I I I I II II llllll II I I I II llll 

Db 17479 TTATAATATATATATATATATAGAAAAATTAAATTATTTAAATAATTTAATATAAATTTT 17538 

Qy 1963 tttcaaaattgaaacgtttaagaatttttactactgcaaattcagaataagtgaatttgt 2022 

II III II I I II I Mil I III Mill II I I 

Db 17539 TTAAAAATTTCTTAAATGTATTATTTTTATAAAAAATATTTATATAATAAAATCATGTTT 17598 

Qy 2023 tttttagaaagattaaataa-gttagtattacgatttttagtttgatttggtggaaagt 2080 

III I II I III II III II Mill I II II 

Db 17599 TTTAAAAAATAAACAAAAAATTTTTAATAAATAAATTTTATAATGAAATATAATTTATTT 17658 

Qy 2081 aatgtatgtttttgaacataattatttgacaataattaagttttctagggaataaacgga 2140 

I I I Mill II I II I III I II II III MM I II III 
Db 17659 ATTTTTCATTTTTAAAAAAAAATTTTTTAAAAAAAATAATTTTTTTTTTAAAAAAAAACT 17718 

Qy 2141 aatatcttcttcttttttgtaaaattactaatgcaagaacaaacaacgttttggggagca 2200 

I II II I II I II II I I II II 
Db 17719 ATATACTAATTATAAATTAATAGATATTTATATATATATAAATATTTAATATATTATTAT 17778 

Qy 2201 aataatctagctttaagtagtcagtgtaactctcaaaatctggtcataacttctaggctg 2260 

• i i mil ii i 1 1 i i in n mi i ii 

Db 17779 ATATCTAATAATTTAAATAAAAAATITTAAAATTTAAAAATGTAGATATAATTTATAAAA 17838 

Qy 2261 agtttgctgtgctacagtagtaagtctatagaaacttacctgacaaaacgacatgacgtc 2320 

III I I II I I I I I III Ml llll 
Db 17839 ATTTATATTCTCATATTTATTTATTATTAATTTAATTTATATAAATAATATAAIAATTTA 17898 



Qy 2321 agggtcgaatctacaacttttcctttttcttcaattaacatatggttgattcaagttccg 2380 

I I I M I I I III I I III I llllll 
Db 17899 ATTAATTATTATATATT1ATAAATTIATAIATTAITGAAIATTIATATAATATATATATA 17958 

Qy 2381 atctataataatttattacgatttatcaatttcaattaccttatatcatcctattataaa 2440 

I II III II I Mill I II II I I I II llll 
Db 17959 TATATAGAAAAATTAAATTATTTAAATAATTTAATATAAATTTTTTAAAAATTTCTTAAA 18018 

Qy 2441 tataagtcagttcaattcagttttcgaaagttcccaaaaattttgaattttattaaattt 2500 
I II II I I I II I I I I Ml I llll 

Db 18019 TGTATTATTTTTATAAAAAATATTTATATAATAAAATCATGTTTTTTAAAAAATAAACAA 18078 

Qy 2501 attccctaaaaccgaaatagttatatctttcaaatttaagtttcatttttcaatccgatt 2560 

I I II I II II I I II M 1 1 1 1 f I M 1 1 II 

Db 18079 AAAATTTTTAATAAATAAATTTTATAATGAAATATAATTTATTTATTTTTCAATTTTTTT 18138 

Qy 2561 tcaatttcatccttttataactctctattatctataattacataaatttcaaattaattt 2620 

II I I III II II I II II II I I I Mill 
Db 18139 AAAAAATTTTTTAAAAAAAATAATTTTTTTTTTAAAAAAAAACTATATACTAATTATAAA 18198 

Qy 2621 tgaaatatttacactttagtccctaagttcaaaactataaattttcactttagaaattaa 2680 

I II I Ml II I M III I I I I I I II llll 
Db 18199 TIAAIAGATATTIATATATATATAAATATTTAATATATTATTATATATCTAATAATTTAA 18258 

Qy 2681 tcatttttcacatctaagcatcaaatttaaccaaatgacacaaatttcatgattagttag 2740 

I I I II I I III Ml II II I II 

Db 18259 ATAAAAAATTTTAAAATTTAAAAATGTAGAIATAATTTATAAAAATTTATATTCICAIAT 18318 

Qy 2741 atcaagcttttgagtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcat 2800 

I II I I I I I Mill llllll II II I I II 

Db 18319 TTATTTATTATTAATTTAATTTATATAAATAATATAATGATTTAATTAATTATTATATAT 18378

Qy 2801 ttatcaatttgaacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttg 2860 

MM Mill I I I I I IMI llllll II 

Db 18379 TTATAAATTTATATATTATTGAATATTTATATAATATATATATATATATAGAAAAATTAA 18438 

Qy 2861 tttctttttgttgcaaacggtggagagaagagggaaatgaagattgaccatattttttta 2920

II III I III I II I II llll I I' 

Db 18439 ATTATTTAAATAATTTAATATAAATTTTTTAAAAATTTCTTAAATGTAITATTTTTATAA 18498 

Qy 2921 ttatgttttaacatataatattaataattt----aatcataattatactttggtgaatgt 2976 

I llll llll III III IMI III I I 

Db 18499 AAAATATTTATAIAATAAAATCATGTTTTTTAAAAAATAAACAAAAAATTTTTAATAAAT 18558 

Qy 2977 gacagtggggagatacgtaaagtattttaacattatactttttgcaagcagttggctggt 3036 

II II III llll II I Mill II I II 

Db 18559 AAATTTTATAATGAAATATAATTTATTIATTTTTCAATTTTTTTAAAAAATTTTTTAAAA 18618 

Qy 3037 ctacccaagagtgatcaaagtttgagctgccttcaatgagccaatttttgcccataatgg 3096 

II MM llllll llll I M 

Db 18619 AAAATAATTTTTTTTTTAAAAAAAAACIATATACTAATTATAAATTAATAGATATTTATA 18678 

Qy 3097 ataaaggcaatttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaata 3156 

I llll I III I llllll I II I Mill M I 
Db 18679 TATATATAAATATTTAATATATTATTAIATAICTAATAATTTAAATAAAAAATITTAAAA 18738 

Qy 3157 aggtggcctggtcacacacacaaaaaaaaactaatgttggttggttgaattttatattac 3216 

II II I I Mill III llll llll 
Db 18739 TTTAAAAATGTAGATAIAATTTATAAAAATTTATATTCTCATATTTATTTATTATTAATT 18798 

Qy 3217 ggaatgtaatattatattttaaaataaaattatgttatttagattcttaatattttggag 3276 

II III II I I II Mill I I II III III II I I 
Db 18799 TAATITATATAAATAATATAATGATTTAATTAATTATTATATATTTATAAATTTATATAT 18858 

Qy 3277 cattccatactataatttcgtaacataatattaaaatatagtaatata 3324 

III III I II II MM I II llll II 

Db 18859 TATTGAATATTTATATAATATATATATATATATAGAAAAATTAAATTATTTAAATAATTT 18918 

Qy 3325 aagtgtaattaactttaaattacaagcataatattaaattttgaatcaattaatt 3379 

II I II llll II II I MM II II II 
Db 18919 AATATAAATTTTTTAAAAATTTCTTAAATGTATTATTTTTATAAAAAATATTTATATAAT 18978 

Qy 3380 tttatttctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaa 3439 
Qy 2381 atctataataatttattacgatttatcaatttcaattaccttatatcatcctattataaa 2440

I II III II I Mill I II II II I II Mil 

Db 3043 TATATAGAAAAATTAAATTATTTAAATAATTTAATATAAATTTTTTAAAAATTTCTTAAA 3102 

Qy 2441 tataagtcagttcaattcagttttcgaaagttcccaaaaattttgaattttattaaattt 2500 

III II I I I II I I I I III I llll 
Db 3103 TGTATTATTTTTATAAAAAATATTTATATAATAAAATCATGTTTTTTAAAAAATAAACAA 3162 

Qy 2501 attccctaaaaccgaaatagttatatctttcaaatttaagtttcatttttcaatccgatt 2560 

I I II I I I II llll II llllllllll II 
Db 3163 AAAATTTTTAATAAATAAATTTTATAATGAAATATAATTTATTTATTTTTCAATTTTTTT 3222 

Qy 2561 tcaatttcatccttttataactctctattatctataattacataaatttcaaattaattt 2620 

llll III I I II I II II I I I I I Hill 
Db 3223 AAAAAATTTTTTAAAAAAAATAATTTTTTTTTTAAAAAAAAACTATATACTAATTATAAA 3282 

Qy 2621 tgaaatatttacactttagtccctaagttcaaaactataaattttcactttagaaattaa 2680 

III I I II II I II III I I I I I I II llll 
Db 3283 TTAATAGATATTTATATATATATAAATATTTAATATATTATTATATATCTAATAATTTAA 3342 

Qy 2681 tcatttttcacatctaagcatcaaatttaaccaaatgacacaaatttcatgattagttag 2740 

I I I II I I III III II II I II 

Db 3343 ATAAAAAATTTTAAAATTTAAAAATGTAGATATAATTTATAAAAATTTATATTCTCATAT 3402 

Qy 2741 atcaagcttttgagtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcat 2800 

I II I I I I I lllll I II II I II II I I II 

Db 3403 TTATTTATTATTAATTTAATTTATATAAATAATATAATGATTTAATTAATTATTATATAT 3462 

Qy 2801 ttatcaatttgaacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttg 2860 

llll lllll lllll II II II I II I II 
Db 3463 TTATAAATTTATATATTATTGAATATTTATATAATATATATATATATATAGAAAAATTAA 3522 

Qy 2861 tttctttttgttgcaaacggtggagagaagagggaaatgaagattgaccatattttttta 2920

II III I III I I I I II III! I I 

Db 3523 ATTATTTAAATAATTTAATATAAATTTTTTAAAAATTTCTTAMTGTATTATTTTTATAA 3582 

Qy 2921 ttatgttttaacatataatattaataattt----aatcataattatactttggtgaatgt 2976

I llll llll II I III II I I I I III I I 

Db 3583 AAAATATTTATATAATAAAATCATGTTTTTTAAAAAATAAACAAAAAATTTTTAATAAAT 3642 

Qy 2977 gacagtggggagatacgtaaagtattttaacattatactttttgcaagcagttggctggt 3036 

II I I • II I llll II I lllll II I II 

Db 3643 AAATTTTATAATGAAATATAATTTATTTATTTTTCAATTTTTTTAAAAAATITTTIAAM 3702 

Qy 3037 ctacccaagagtgatcaaagtttgagctgccttcaatgagccaatttttgcccataatgg 3096 

II I I II I II I I I llll I II 

Db 3703 AAAATAATTTTTTTTTTAAAAAAAAACTATATACTAATTATAAATTAATAGATATTTATA 3762 

Qy 3097 ataaaggcaatttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaata 3156 

I III I I III I llllll I II I lllll II I 

Db 3763 TATATATAMTATTTAATATATTATTATATATCTMTMTTTAAATAAAAAATTTTAAAA 3822 

Qy 3157 aggtggcctggtcacacacacaaaaaaaaactaatgttggttggttgaattttatattac 3216 

II I I I I lllll II I llll llll 

Db 3823 TTTAAAAATGTAGATATAATTTATAAAAATTTATATTCTCATATTTATTTATTATTAATT 3882 

Qy 3217 ggaatgtaatattatattttaaaataaaattatgttatttagattcttaatattttggag 3276 

I I III II I I II lllll I I II III III II I I 
Db 3883 TAATTTATATAAATAATATAATGATTTAATTAATTATTATATATTTATAAATTTATATAT 3942 

Qy 3277 cattccatactataatttcgtaacataatattaaaatatagtaatata 3324 

III III I II II llll I I I I III II 

Db 3943 TATTGAATATTTATATAATATATATATATATATAGAAAAATTAAATTATTTAAATAATTT 4002 

Qy 3325 aagtgtaattaactttaaattacaagcataatattaaattttgaatcaattaatt 3379 

II I I I II II II II I III I II II I I 
Db 4003 AATATAAATTTTTTAAAAATTTCTTAAATGTATTATTTTTATAAAAAATATTTATATAAT 4062 

Qy 3380 tttatttctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaa 3439 

II I II III II I I llllll IIIIM llll I III 

Db 4063 AAAATCATTTTTTTTTAAAAATAAACAAAAAATTTTTAATAAATAAATTTTATAATGAAA 4122 



Qy 3440 taaaaataatttttccttaatgttgaaacaactcatgttatacttcaaaattataagtat 3499 

II II I llll I II II II I II II II II lllll I I I 
Db 4123 TATAATTTATTTATTTTTCATTTTTTTAAAAAAAATTTTTTAAAAAAAAATATTTTTTTT 4182 

Qy 3500 tatatttaccttgatgatttatttattagtatattaattctgattataattatggtggga 3559 

II I I I I I III III III llll II I 
Db 4183 TAAAAAAAAACTATATACTAATTATAAATTAATAGATATTTATATATATATAMTATTTA 4242 

Qy 3560 tacaatcgctttccactaaatattttaactatgatttataaatttatttcaacatcgtat 3619 

lllll llll lllll II I I I II I I lllll 
Db 4243 ATATATTATTATATATCTAATAATTTAAATAAAAAATTTTAAAATTTAAAAATATAGATA 4302 

Qy 3620 atttacttattaatacataatttatcataattttatggaaattgagaccaagaaacatta 3679 

I II III III I I I! Mill II I IIIIM 

Db 4303 TAATTTATAAAAATTTATATTCTCATATTTATTTATTATTAATTTAAITTATATAAATAA 4362 

Qy 3680 agagaacaaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaa 3739 

llllll lllll III II II I II II II II 
Db 4363 TATAATGATTTAATTAATTATTATATATTTATAAATTTATATATTATTGAATATTTATAT 4422 

Qy 3740 gtactcttaaccaaacacaaaaattcaaatcaaatgaactaaataagataatataacata 3799 

lllll I llll I llll llllll llll II I I 
Db 4423 ATAATATATATATATATAGAAAAATAAAATTATTTAAATAATTTTTCATAAAATTTAAAA 4482 

Qy 3800 cggaacatcttacttgtaatcttacattcccataattttattatgaaaaataatcttata 3859 

II lllll llll I II II I II III lllllllll III 

Db 4483 --AAATTTCTTAAATGTATTATTTAATAAAAAATTACTTTTTAAAAAAAATAATTTTAAT 4540

Qy 3860 ttactcgaactaaatgttgtcacaaattattatctaaataaagaaaaacacttaattttt 3919 

II I II II II I II I I III III II I I Ml IN 
Db 4541 TTTTTAAAAAAAATAGTAAATAATAAAAAAAAAAAAAAAAAAAAATGAAAATTATATTAT 4600
DMU37541 19517 bp DNA circular INV 04-APR-2000
Drosophila melanogaster complete mitochondrial genome.
U37541
U37541.1 GI:1166529
Drosophila melanogaster.
Mitochondrion Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.

1 (bases 12511 to 12682)
Clary,D.O., Goddard,J.M., Martin,S.C., Fauron,C.M. and
Wolstenholme,D.R.
Drosophila mitochondrial DNA: a novel gene order
Nucleic Acids Res. 10 (21), 6619-6637 (1982)
83090428

2 (bases 5269 to 5695)
Clary,D.O., Wahleithner,J.A. and Wolstenholme,D.R.
Transfer RNA genes in Drosophila mitochondrial DNA: related 5'
flanking sequences and comparisons to mammalian mitochondrial tRNA
genes
Nucleic Acids Res. 11 (8), 2411-2425 (1983)
83220794

3 (bases 404 to 5272)
de Bruijn,M.H.
Drosophila melanogaster mitochondrial DNA, a novel organization and
genetic code
Nature 304 (5923), 234-241 (1983)
83245048

4 (bases 804 to 1778)
Satta,Y., Ishiwa,H. and Chigusa,S.I.

Analysis of nucleotide substitutions of mitochondrial DNAs in 

Drosophila melanogaster and its sibling species 

Mol. Biol. Evol. 4 (6), 638-650 (1987) 

88174373 

5 (bases 5268 to 13619) 
Garesse,R. 

Drosophila melanogaster mitochondrial DNA: gene organization and 
Qy 3479 atacttcaaaattataagtattatatttaccttgatgatttatttattagtatattaatt 3538

III I I III II MM I II II III I I II llll I 
Db 138317 ATAATAAATAATATTTATAATTAAGTATAATAATATTATTAAATAATATAAATATATAM 138258 

Qy 3539 ctgattataattatggtgggatacaatcgctttccactaaatattttaactatgatttat 3598 

II II III II II I lllllll II II I 

Db 138257 TATATATTATATATATTTTATTTATTATAAATTTATTATATTATTTTATATAATTTTAAA 138198 

Qy 3599 aaatttatttcaacatcgtatatttacttattaatacataatttatcataattttatgga 3658 

III I III I I Mill I I II III Mill II I I 

Db 138197 AAAATAATTATATATTAATATATAAAAATTAAGTATATAAAAATATTATAATATT-TATA 138139

Qy 3659 aattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaaaa 3718 

II I II II II II I Mill II I I I II III 

Db 138138 TTTTATAAATAAATAAAATAAATTATATAAATTTTAATATATATTTTATAATTTAAATTT 138079 

Qy 3719 tgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaac 3778 

III II II II I I I I I I III II II 

Db 138078 ATTAATACATAAATATATTTTATATATATTAATTATTATATATATTATTATTTTTTATAA 138019 

Qy 3779 taaataagataatataacatacggaacatcttacttgtaatcttacattcccataatttt 3838 

I 1 1 II I 1 1 . 1 . I I I 1 1 I I 1 1 1 1 I I llll 

Db 138018 AATATTATACATTAATATTAATTAATTTTTATATTATTTATTATATAATGAGTTTATAAT 137959 

Qy 3839 attatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaaat 3898 

II II II llll I I I II I I II I llll I II 

Db 137958 ATAATATAATTTAATATATATAATAGTATATAATATATATTATAAATTAATTAATTTAAA 137899 

Qy 3899 aaagaaaaacacttaatttttataacattttttcatatatttgaaagattatattttgta 3958 

i i i i i i mill i ii ii iiii mm i m mini i 

Db 137898 ATAAATAIAAIATAAATTTTAAAAATATATTTTAATATATMAAAATMTATATTTAAAA 137839 

Qy 3959 tatttacgtaaaaatatttgacatagattgagcaccttcttaa-cataatcccaccata 4016 

II I II I I III . llll I II II II I II 

Db 137838 ATTTAATTTATATAMTTAATATGTAATATAATAAAATAAAAAATAATTATATTATAATT 137779 

Qy 4017 agtcaagtatgtagatgagaaattggtacaaacaacgtggggccaaatcccaccaaacca 4076 

II III II II I I III III II I II I 

Db 137778 TTAAAATTATATAAATAATATATTAAGTTAAATAAATTATTTTATTTATATGATTAAATA 137719 

Qy 4077 tctctcattctctcctataaaaggcttgctacacatagacaacaatccacacacaaatac 4136 

I I III llll I II I I I II II lllllll 
Db 137718 TAAATAATTTATATTTATATAGTTGTTAAAATATTTTGATTTATATTATATAACAAATAA 137659 

Qy 4137 acgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttcttcct 4196 

I I I II Hill II III MUM 

Db 137658 AATATTGGTAATTATTATTTAATGTCGGTATAATTAATATAACATCCTTCTATTATTAAT 137599 

Qy 4197 tttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcggcagc 4256 

I II II II I I I I I I II III 

Db 137598 CATTCACCTGATATATAATATATTATTATCTTATATGTCCTAAAAAATATANTAGNNTGA 137539 

Qy 4257 ggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggcttcaaa 4316 

II II III 

Db 137538 TATTCACATAATGATATCATGATATGTNNNNNNMNNNMNW 137479 

Qy 4317 atacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacg 4376 

II I 

Db 137478 fMNNNNNNNNNNNWTONNNNNNNNNNW 137419 

Qy 4377 aagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgca 4436 

I Ml I llll llll llll 
Db 137418 TATAATATAAGTGATAGTTATAATATTAACATTATATGATGTTGATATGATATATATATA 137359 

Qy 4437 aacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaag 4496 

I I I I II II III II I I I II 

Db 137358 TATAAAAATATATATATATATTTAATATATTATAAAATGAAATATTAAATAGIGATTTTA 137299 

Qy 4497 aaaatctcgacgaattcccccgggcgtcgacggctagcgaagatcttcgggcccgtcgag 4556 

lllllll II llll I 

Db 137298 TATATATGGATATAAAAGTATTAGTATTATATAAATTAATATATATAAAAATTAATTATA 137239 

Qy 4557 ccttgaatcatatgacactg-gtgcatgtgccatcatcatgcagtaatttcatggtatat 4615 



I II llll I II II I I II I I II I I II I 

Db 137238 ATATAAAATATATATAATAATGTATATATAAATTGATATTATAATATGTAATTATTAAAG 137179 

Qy 4616 cgtaatatatagttaataaaaaagatggtgattgggaaatgtgtgtgtgcattcctccat 4675 

llll I II I I III I II I I I I I I II 
Db 137178 TGTAAAAGATGATAATTAATTATATATTATATAATAATAGTTTATTATAATATGTAATM 137119 

Qy 4676 gcactaatggtgaatctctttgcatacatagaaattctaaatggttatagtttatgttat 4735 

llll I I I II I I II II llll II 
Db 137118 AITATAAT----TAAATGTATGTTATTTGATGATTTAGTATTTAATTAAATATGAATAAA 137063

Qy 4736 agtgtatgttgtagtgaaattaattttaaatgttgtatctaatgttaacatcacttggct 4795 

I Mil I llll II Mil I I I I II II I 
Db 137062 AATGTATATATATTATAGAATATAAATATATATTTTTTTATAAATAAATATATAAAGAAA 137003 

Qy 4796 tgatttatgttatgttatgtattttactttaatgatattgcatgtattgttaatttaaca 4855 

II III llll II III I I I II I II I I II I 

Db 137002 GAATATATTTTATTATAGTTATATATGTATTATATAGTGATTTTTTAAATATTATAAATA 136943 

Qy 4856 ttgcttgatcattatactcttctactattaattataaatggcactgttttgtttaaactt 4915 

I I II II II II I iiimm I II I I III I 

Db 136942 ATATAGTCTIATATTATATTTTATAIAGTTATTATAAATAATATGATTATTTATAATAAT 136883 

Qy 4916 tttacaagttaagacatgtataaatatatgacaatataattacaggttttagttcaatgt 4975 

I II II I II III I I llll I llll I' 

Db 136882 ACTTAAATATATAATATATATTATATATAATATAAGTAATAATTATATTTATAATTIAGA 136823 

Qy 4976 tagctatcttagtatgttattgatgatcttaattacatttaaacaaattccacttaaaat 5035 

II I I I I I II I II I II II II I III I III II 

Db 136822 TAATTTTATAAAATTAATAAAATTAAITATTATAGTATGTATAT-AATGAAATATAATAT 136764

Qy 5036 tttaataaataataacaaataattattgtaatataatacattaaatgcaacaaaaaatga 5095 

I I I mill III I I llllllllll II I I II I I I 
Db 136763 GATTAATACTAATAAATTATATATTATATAATATAATATATAATAAATAAATATATTATA 136704 

Qy 5096 aataaataaaataaaatagcaaataattgttataatattgtaatataatatgtaccatat 5155 

llll II lllll MM II lllll III I I 
Db 136703 ATATTAGATATATTTGTTATATATAATAGATTATAAAGACGAAAATAATGAGTATATTTT 136644 

Qy 5156 tcttaactgaaatagggt-ctaacctataatccctaaaatttcagtttaaatatttt-- 5211 

.III I I I II I llll I I Ml III II llll 
Db 136643 AGTTATATATATGGGAGTAGTAATAATATATTAAGTGTAGTTATAGTATATATATGATAG 136584 

Qy 5212 -tatacctaccatattattagaactctttttaaatatattaaaattttaattataccaat 5270 

I I II I I I II II I II I III III II I I 
Db 136583 GAAAATATAATAGTTAAATATAAGTATATATCTATAATATAATATATAGTATTATAGGIA 136524 

Qy 5271 ttaattaaactattaattatcttaactaaaatctaaaattttatttaacctattaataaa 5330 

II I II I llll II I I III llll I lllll II llll 
Db 136523 TTGAATATATAATTATAAATATATATTAATCAGAAAAAATATATTTTGATTAGTAATTTT 136464 

Qy 5331 ttcctaattatcttatctaatttaaaactctaattatcctaatttaatttaaattcttaa 5390 

I I lllll I I II II II II llll 
Db 136463 ATAATCATTATAGTTATTTATATATAATGATAGTTATNNNNNNNNNNNNNNNNNNNNNNN 136404

Qy 5391 ttatcttaatttgtaacctcctccacccagctagatgctggacccgaatccgggagatta 5450 

II 

Db 136403 NNNNNNNNNNNNNNNNNNNNNWtTONNNNNN^^ 136344 

Qy 5451 cat 5453 

II 

Db 136343 TAT 136341 



RESULT 10 
DMU11584

LOCUS DMU11584 4601 bp DNA 

DEFINITION Drosophila melanogaster Orec 

ACCESSION U11584 

VERSION U11584.1 GI:508826

KEYWORDS mitochondrial DNA; A+T region; tandem repeats 

SOURCE fruit fly. 

ORGANISM Mitochondrion Drosophila melanogaster 



INV 23-JUL-1994 
mitochondrial A+T region. 
Butenhoff,C., Champe,M., Chavez,C., Chew,M., Ciesiolka,L.,
Doyle,C.M., Farfan,D.E., Galle,R., George,R.A., Harris,N.L.,
Hoskins,R.A., Houston,K.A., HuMiasti,S.R., Rarra,R., Kearney,L.,
Kira,E., Lee,B., Lewis,S., Li,P., Lomotan,M.A., Mazda,P.,
Moshrefi,A.R., Moshrefi,M., Nixon,K., Pacleb,J.M., Park,S.,
Pfeiffer,B., Poon,L., Sequeira,A., Sethi,H., Snir,E.,
Svirskas,R.R., Wan,K.H., Weinburg,T., Zhang,R., Zieran,L.L. and
Rubin,G.M.
TITLE Direct Submission
JOURNAL Submitted (29-JUL-1999) Drosophila Genome Center, Lawrence Berkeley
Laboratory, MS 64-121, Berkeley, CA 94720, USA
COMMENT On Mar 8, 2000 this sequence version replaced gi:7025688.
For further information about this sequence, including its location
and relationship to other sequences, please visit our sequence
archive Web site (http://www.fruitfly.org/sequence/) or send email
to bdgp@fruitfly.berkeley.edu. All contigs in this submission meet
the following cutoffs: length >= 200 bases.
NOTE: This is a 'working draft' sequence. It currently
consists of 133 contigs. The true order of the pieces
is not known and their order in this sequence record is
arbitrary. Gaps between the contigs are represented as
runs of N, but the exact sizes of the gaps are unknown.
This record will be updated with the finished sequence
as soon as it is available and the accession number will
be preserved.
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9 bp in length 



21127 21826: contig of 700 bp in length
21827 21906: gap of unknown length
21907 23735: contig of 1829 bp in length
23736 23815: gap of unknown length
23816 25556: contig of 1741 bp in length
25557 25636: gap of unknown length
25637 26792: contig of 1156 bp in length
26793 26872: gap of unknown length
26873 28359: contig of 1487 bp in length
28360 28439: gap of unknown length
28440 29898: contig of 1459 bp in length
29899 29978: gap of unknown length
29979 31836: contig of 1858 bp in length
31837 31916: gap of unknown length
31917 33347: contig of 1431 bp in length
33348 33427: gap of unknown length
33428 34568: contig of 1141 bp in length
34569 34648: gap of unknown length
34649 35754: contig of 1106 bp in length
35755 35834: gap of unknown length
35835 37815: contig of 1981 bp in length
37816 37895: gap of unknown length
37896 39641: contig of 1746 bp in length
39642 39721: gap of unknown length
39722 41135: contig of 1414 bp in length
41136 41215: gap of unknown length
41216 42477: contig of 1262 bp in length
42478 42557: gap of unknown length
42558 44229: contig of 1672 bp in length
44230 44309: gap of unknown length
44310 45922: contig of 1613 bp in length
45923 46002: gap of unknown length
46003 47999: contig of 1997 bp in length
48000 48079: gap of unknown length
48080 49982: contig of 1903 bp in length
49983 50062: gap of unknown length
50063 51360: contig of 1298 bp in length
51361 51440: gap of unknown length
51441 53101: contig of 1661 bp in length
53102 53181: gap of unknown length
53182 54926: contig of 1745 bp in length
54927 55006: gap of unknown length
55007 56937: contig of 1931 bp in length
56938 57017: gap of unknown length
57018 57606: contig of 589 bp in length
57607 57686: gap of unknown length
57687 58632: contig of 946 bp in length
58633 58712: gap of unknown length
58713 60613: contig of 1901 bp in length
60614 60693: gap of unknown length
60694 62727: contig of 2034 bp in length
62728 62807: gap of unknown length
62808 65311: contig of 2504 bp in length
65312 65391: gap of unknown length
65392 66685: contig of 1294 bp in length
66686 66765: gap of unknown length
66766 68830: contig of 2065 bp in length
68831 68910: gap of unknown length
68911 71103: contig of 2193 bp in length
71104 71183: gap of unknown length
71184 72193: contig of 1010 bp in length
72194 72273: gap of unknown length
72274 74138: contig of 1865 bp in length
74139 74218: gap of unknown length
74219 76236: contig of 2018 bp in length
76237 76316: gap of unknown length
76317 77913: contig of 1597 bp in length
77914 77993: gap of unknown length
77994 80808: contig of 2815 bp in length
80809 80888: gap of unknown length
80889 82776: contig of 1888 bp in length
82777 82856: gap of unknown length
82857 85682: contig of 2826 bp in length
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RYGGIYNSRyMTIEKRRRYNYIRKQKNNNNSYFSYCNVYKNNDVNMTAYNYYINNP 
KNKLKEYYEKI KNHVIKKKKK IFSLKFSQNKRNEKKKKYFFINFTS PHDIKDNIKVLF 
NIDnENNIIYSYIKSFKRTPPITKIYLLGTFLLSVLIHMNKNVYKLILFDFNKIFKK 
GEIWRLFTPYLYIGNLYLQYILMFNYLNIYMSSVEISHYKKPEDFLIFLTFGYISNLL 
FTIWANMYNENIMNVKLYIHNFKN7FIKDCVSKYTSRSSTNNNSNNINSNNRSSNNNN 
HYNNSKNIDIKKEQYNHLGYVFSTYILYYWSRINEGTLINCFELFFIKAEYVPFFFII 
QNILLYNEFSLYEVASIFSSYLFFTYEKYFKFNYLRLFFKTLLKVIHIYPLYDRFQMN 
TKLLISILKKRKNGPLPFLQKCYIHNVPRINTMKHMNISDLRKNIEVCKKKNVKHKNV 
WSNFLYIIILKLFNKEIKNYVDFMIILKLLSKYIKIEKKVLLYICEQIEHEIYKFRTR 
DLTLLILILRKNNFDNIYYINLISKSILMKMNKNMSYKDLALIIYSLSKNIYLTDEQI 
YNKEIFNFSILKFENHLNNVNINLHSLSLFFYSYSVYFINNCFYYYYYFHSFFNIITK 
FINIINKNLHLYNSTDLMFLYIGSLHIHNMYTPNHVDQNKEPKNNQKENNNYHNDNHN 
IYLKNINNNCYDHRLDSNDFITMTNYDQGEYNKHIQONKHIQQNKHIQQNKHIQQNKH 
IQRIGTHCTESNSNNQQLIQIQNDEKENRLITYDNSRHNLLRDPCQHNIVERDGERKQ 
NLIKNLIINIKKIIEEKLSSFKIOEIVNILFVSLNKNIIINKKYFHFLNQEKINIRNY 
INIYVNINKIYLNDEEENTSHCILKIKNDNKKDILYHDHMRFLYNLMNEIIYRNDLLN 
MKQIILLLYGLKFNNFMFLQFEKIILKRFICLPKKEIQKIGKEEIMFLYQYFFVRTCL 
FNELKKQNNLFISQDEYENYIYISDKYNESAKLDNSYNMPSNLKEKNTNHHGGKDNTL 
DLYIHDDLFYMNRNKRRDRYRIYLYDNFIFNYPAYYVEQKRDHIDYNESVNNFDNMKS 
FIQLKKKKINKINNNNNNNNNNNNNIYIDTNIQTVNKNYSCTHNNVIKNETNDNYPNS 
TIRNQHPNDQVILNNPVFFYNKRLNWDSIDFEYELTCYNLYLDIYRIVCLRLLTLLK 
NHKLSCLQS IDI LC IYEKLNI RDYRI I KYLYNLKKELLYLDNTYLLKVINI IVKFNLY 
NMISYLQINKILTFINYNNIKESIQILKLIGMLISVHRHNKLSPFHMNNLNVQNAANY 
LFKNLYNLQNIODLKKIEMMNVYDNLTFKFYKLFKNILSINVKRYVQNCNSYNKYEMN 
THTNNLffilffiQHKYIHHMJHKDGRHNNNNlJHYDKVDVSSSSSSSrYYYLNKSGKNLG 
NINVQNLDDININKIKSISYKIKKDQIKDIGYMRVSKYSELMKSMKMMNYDEHFNDEY 
RNVCDEIYEDLFLIYNKNIQVYKNINICNYTFPMAINLLTLNNDENILININKSDDNK 
KLIKVDKKKFLIVDILYNYDYYYTLTKSKLDKLKEYNIYLSYYSNHIKKKNKKILNYK 
KYALLKLIKKRGFNY1CIDADTYVKNKKGKSKDLSYEINKLYINNLILDILKRQKKNH 
LHPHPHTQNRTTKQIKNINIKNKLLLYHQNKKNVKKIIHFKNYKYKIMNLPDQRNHYH 
NKRIRYIKDKSLLAINHKTRNIIEKQRISTSNHLSRLRRMFSL" 

gene complement(20528..21454)
/gene="MAL3P5.5"
CDS complement(20528..21454)
/gene="MAL3P5.5"
/note="predicted using hexExon; MAL3P5.5 (PFC0595c),
Serine/threonine protein phosphatase (PP2), len: 309 aa;
Similarity to serine/threonine protein phophatases.
M.domestica serine/threonine protein phosphatase
(TR:Q42912) BLAST Score: 1005, sum P(1) = 6.9e-107; 60%
identity in 301 aa overlap."
/codon_start=1
/protein_id="CAB38970.1"
/db_xref="GI:4493934"
/db_xref="SPTREMBL:O97259"
/translation="MMGEERKWIEQLRMNPPKLLDESDLRLVCQRVKEILVEENNVQ
SIKPPVIICGDIHGQFFDLLELFDVGGDIMNNDYIFIGDYVDRGYNSVETFEYLLLLK 
LLFPRNITLLRGNHESRQITTVYGFYDECFRRYGNANAWKYCTDIFDYLTLAALVDNQ 
IFCVHGGLSPEIRLIDQLRLINRVQEIPHEGAFGDIMWSDPDEVDDWVANPRGAGWLF 
GPNVTKRFNHINNLELIARAHQLAMEGYRYMFEDSTIITVWSAPNYCYRCGNVAAIMR 
IDEYMNRQMLIFKDTPDSRNSIKNKATIPYFL" 

gene 25252..26157
/gene="MAL3P5.6"
CDS join(25252..25296,25453..26157)
/gene="MAL3P5.6"
/note="predicted using hexExon; MAL3P5.6 (PFC0600w),
Hypothetical protein, len: 250 aa"
/codon_start=1
/protein_id="CAB38972.1"
/db_xref="GI:4493936"
/db_xref="SPTREMBL:O97261"
/translation="MRKYLNRYMYIYNIYNRLEEKYKNFLRLRNMNSHMGASQNMNVN
NNYTMNELEEFEKINNNYNNNNNNINNNINNYYDYMNIKVSQSVQHNKRLQDFYNNKN 
SFQHYIKKLKTCRFDADDIRKLLEKRLAYERDNTLIRNIQEEENKRGIGINGNFGSES 
NSSSSNYDNKYLLYRKINRLNKTNTNKSKNRSRKRKRINSKIDKKYIIKCRACKFINP 
NGFRIEDYYTCQNCGYNDFSVIRSTSPNNAD" 

gene 27547..28290
/gene="MAL3P5.7"
CDS 27547..28290
/gene="MAL3P5.7"
/note="predicted using hexExon; MAL3P5.7 (PFC0605c),
Hypothetical protein, len: 248 aa"
/codon_start=1
/protein_id="CAB41709.1"
/db_xref="GI:4725991"
/db_xref="SPTREMBL:Q9Y011"
/translation="MGGHGGLNILPQKKWNVYRRDAQYRVHYDEHRIIKEEKDKEIKR
RRDEFESTISTLKKNMTKNEDSDNNYNNFYDENGERKTTTNYCNDHINLFIDEERELT
AKQRRHEEFLIKKGHYIYYDRNFNTQHNSIYDKNRNAQIISDFNKMKLCERDWFLNKR
NKNEKTKDNGANFFHIQKDNISEEHNKTENINSDLSLYCNTNNYITHDKKKEKKQMHY
HIKKIIKYKQEKDKEKKRKRQGKEKKKPK"
gene complement(29992..33537)
/gene="MAL3P5.8"
CDS complement(29992..33537)
/gene="MAL3P5.8"
/note="predicted using hexExon; MAL3P5.8 (PFC0610c),
Hypothetical protein, len: 1182 aa"
/codon_start=1
/protein_id="CAB38971.1"
/db_xref="GI:4493935"
/db_xref="SPTREMBL:O97260"
/translation="HAHKVRKEKKTEAQETPWAREQTHMEENNESNIAVTEENVIS
KNGQEIAISRNDQEIAISRNDQEIAISNNDQEIAISRNDQENVALNSSEERQNASKEE 



DSNNVETIENAITNDVLRSNRSTSYSKQKNELTSVTCYVCGETVDLNIWSDHIFAHKL 



Query Match 3.5%; Score 192.6; DB 33; Length 86829;
Best Local Similarity 43.8%; Pred. No. 4.1e-14;
Matches 1064; Conservative 0; Mismatches 1354; Indels 9; Gaps 5;



Qy 1566 tgtgtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtcaattg 1625 

III! I III III llll I III II I I I II II 

Db 39027 TATTTTAAAATAAAATATAAATATTTAATAAAATAATAAAAAAAAATATATGTAATAGTT 39086. 

Qy 1626 aacgttataaaattctctatgatatcctgatctgtttattacattatatgtgtttatgct 1685 

I Mill III I II III I I I I I 

Db 39087 ATATATATAATATTAAATTAATATAAATTAATATAATAIAATAAATAAATAATAATATAT 39146 

Qy 1686 tgagttaagtcaaacattgagattcatagctcacccaattatttaatcatttcaggcaat 1745 

I I II I I I I I III I I III I I I llll K 
Db 39147 ATATTAAATAAATAAAATAAACAAAATAAATTAMITATTTTAAATTAATTAAATAAATA 39206 

Qy 1746 ctgcagacttaggattggatggcgttcaggagcttggattggttttctcacatcatattt 1805 

III llll III I 1 1 1 1 1 I I Ml 
Db 39207 AAATATATTATTTATTAAAATAAATAAATTAATATATATTATTTATTAAAATAAAAATAA 39266 < 

Qy 1806 tattaaataattattaattaaaatttatggacttttggactgtctgactaattttcagaa 1865 

I I III llll llllllll I I II I I I! I II 
Db 39267 ATTAATATA1ATATTTATTAAAATAAAAATAAATTAATATATATTATTTATTAAAATAAA 39326 

Qy 1866 ttttattttggttttgggttttgttgaattttttagataattattttaaatattctgcat 1925 

I I II II III II II I llllllll llll I 

Db 39327 AATAAATTAATATATATTATTTATTAAAATAAAAATAAATTAATATAIATTAITTATTAA 39386 

Qy 1926 aatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaaga 1985 

III llll II I I Ml I I I I I 
Db 39387 AATAAAAATAAATTAATATATATTATTTATTAAAATAAAAATAAATTAATATATATTATT 39446 

Qy 1986 atttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataagtt 2045 

II I II llll III. I I III II III I Mil 

Db 39447 TATTAAAATAAAAATAAATTAATATATATTATTIATTAAAATAAAAATAAATTAATAATT 39506 

Qy 2046 agtattacgatttttagtttgatttggtggaaagtaatgtatgtttttgaacataattat 2105 

I II I I II I I III II I II I Ml I II 
Db 39507 AAAATAAATAAATTAATATATATTATTAATTAATATAAATAATAAATAAATAATTAAAAT 39566 

Qy 2106 ttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgtaaaat 2165 

III I II I II I III llll I I I I III 
Db 39567 AAATAAATTAATATATATTATTAATTAAAATAAATAATAAATAAATTAATAATTCAAATA 39626 

Qy 2166 tactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcagt 2225 

I II Ml I I I II I II I I II I I II I I 
Db 39627 AATATATTATATAATATATATTAATTAAATTAATAATTTAAATAAATAAATGTTTTTTAT 39686 

Qy 2226 g--taactctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtagta 2282 

III I I Mil II II I I II I 
* 1 67262: contig of 67262 bp in length
* 67263 67462: gap of unknown length
* 67463 82485: contig of 15023 bp in length
* 82486 82685: gap of unknown length
* 82686 130281: contig of 47596 bp in length.
FEATURES Location/Qualifiers
source 1..130281
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="12"
/clone="3D7"
BASE COUNT 52250 a 11780 c 11855 g 53996 t 400 others
ORIGIN



Query Match 3.6%; Score 197; DB 60; Length 130281;
Best Local Similarity 45.7%; Pred. No. 1.2e-14;
Matches 1118; Conservative 0; Mismatches 1290; Indels 37; Gaps 11;



Qy 1564 attgtgtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtcaat 1623 

III Mil I II III I Mil I III II III I I I I 
Db 101320 ATTTTATTTATTTCATTAAAAAAGGATAAAGACAATAATAAATTATAATAATAAAAAACA 101261 

Qy 1624 tgaacgttataaaattctctatgatatcctgatctgtttattacattatatgtgtttatg 1683 

I llll I I II II III II II II I II II 

Db 101260 TTCCAAATATATACCCCCAAATATATATTATATATGTATAAGACTATACAAATATATAAA 101201 

Qy 1684 cttgagttaagtcaaacattgagattcatagctcacccaattatttaatcatttcaggca 1743 

I I I I I II III I I llll 

Db 101200 TATAATAATCACATATTAATATAATATATATTTATTAATTTATAAAATAAATAAAAATAT 101141 

Qy 1744 atctgcagacttaggattggatggcgttcaggagcttggattggttttctcacatcatat 1803 

II I I I III II II I II I I I II I 

Db 101140 ATATATATAATTATATAATATATCAAATTAAATCATTATAAAATTTATTTAAAATATATT 101081 

Qy 1804 tttattaaataattattaattaaaatttatggacttttggactgtctgactaattttcag 1863 

llllllll III Mill Ml I II llll I 

Db 101080 AAAATTAAATATATATATATTAATAAATAATTAAGTTAATTATTTAATAAATAAAAATAA 101021 

Qy 1864 aattttattttggttttgggttttgttgaattttttagataattattttaaatattctgc 1923 

I I llll III I I I I I llll II I I I I II 
Db 101020 TAATAAATTTAAATATTAATATAATTAAATTCATAATACACATTAATTAATAAAATATGA 100961 

Qy 1924 ataatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaa 1983 

III I I II I I I II III III II III II I II 

Db 100960 ATATTAATATAAATAATAAATAGAAAAATATTAATACAATTTAAATATTAAATAAATAAA 100901 

Qy 1984 gaatttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataag 2043 

I II II I II I llll II II III II II Mill 

Db 100900 AATATTATAATTTATAAATAATAAAATATTAATATAAATTAATTAATAATATATAATAAA 100841 

Qy 2044 ttagtattacgattttt—agtttgatttggtggaaagtaatgtatgtttttgaacata 2100 

III III I I III II III III Mil Ml III 
Db 100840 TTAATATAATTAAATTTAAAATATAAATTAATAAAAAAATAATACTAATATTAATATAAA 100781 

Qy 2101 attatttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgt 2160 

II I I I Mill III llll III II llll III I I I 

Db 100780 ATAAAATAATAATAAATAAATTTTAATTAAAATTAAAT • • AATAAAATATTAATATAAAT 100723 

Qy 2161 aaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtag 2220 

II I I llll I II III I I I I I II II I I 

Db 100722 TAATTAAATAATAATATAATAAATTAATTAAATAATAATATAATAAATTAATTAATTAAC 100663 

Qy 2221 tcagtgtaactctcaaaatctggtcataacttctaggc—tgagtttgctgtgctacag 2277 

I Ml I llll I I llll II I I I II I llll 

Db 100662 AATATTTAAATAATTAAATATAATAATATATATTAAACAATTAATTATTATAAATTAAAG 100603 

Qy 2278 tagtaagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaac 2337 

III II I I III I II II II I Mill 

Db 100602 AATTATTAATAATAATATATAAATTAATTAAATAAAATGAATTATTTAAATAATTAAAAC 100543 

Qy 2338 --ttttcctttttcttcaattaacatatggttgattcaagttccgatctataataattta 2395 ; 



I I I I I Mill I I I llll I I III III 

Db 100542 AATAATATATATAAATTAATTATATATTTAGTAAATAAAATAAAATTAATAATTAAATTA 100483 

Qy 2396 ttacgatttatcaatttcaattaccttatatcatcctattataaatataagtcagttcaa 2455 

M I II II II II I II llll I MUM II II II 
Db 100482 ATAATTTATAGTTATAAAAAATAAAAATAAATATATAATTAAATATATAAATAAATTAAA 100423 

Qy 2456 ttcagttttcgaaagttcccaaaaattttgaattttattaaatttattccctaaaaccga 2515 

I I II II I Mill llll Mill Ml I I 

Db 100422 TAAATATATAAATAAATTAAATATATATACAATTAAATTAAATATATATATAATTAAATT 100363 

Qy 2516 aatagttatatctttcaaatttaagtttcatttttcaatccgatttcaatttcatccttt 2575 

II I I Mil II MM I II III I 

Db 100362 AAATATATATATAATTAAATTAAATATATATATAATTAAATTAAATTAATAAATAAAATA 100303 

Qy 2576 tataactctctattatctataattacataaatttcaaattaattttgaaatatttacact 2635 

II I I M II lllllill II Ml I II I II I II II I 

Db 100302 AATTAATAAATAAAATATATAATTAAATTAAATAAATATATAATTAAATTAATATATAAA 100243 

Qy 2636 ttagtccctaagttcaaaactataaattttcactttagaaattaatcatttttcacatct 2695 

III II I II HUM I III llllllll III I I II 

Db 100242 TAAAT ■ -TAAATATATATATAATTAAATTATATTATATAAATTAATAATATATAATATAA 100185 

Qy 2696 aagcatcaaatttaaccaaatgacacaaatttcatgattagttagatcaagcttttg--- 2752 

I II III llll llll I llll II II 

Db 100184 TAAATAAAATAATAATTAAATAATAATATATAATATAATAAATAAACAATAATTAAATTA 100125 

Qy 2753 agtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaa 2807 

I I III III I! Ill III III Mill Ml II 
Db 100124 ATAATATATTATAAATAATATAATAATTAAATATAATATAATAATTAATTAAATTAATAA 100065- 

Qy 2808 tttgaacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttgtttcttt 2867 

I I M III III III MM I I llll I 

Db 100064 TATATTAAATAAAATAATAATAAATATTAAACAATTAAATAAAATATACATAATTAATAT 100005 

Qy 2868 ttgttgcaaacggtggagagaagagggaaatgaagattgaccatatttttttattatgtt 2927 

II II I III I II Ml I Ml II III I 

Db 100004 TAATAAATAATTATTATATTAAAATAATTAAAAAAATTAATTAATTTAAGATAATATATA 99945 

Qy 2928 ttaacatataatattaataatttaatcataattatactttggtgaatgtgacagtgggga 2987 

II Mill Mil! Ill I II II Mill I 

Db 99944 ATATTATATATAATTAATTATTAATAAATGTTTTATATTTAATTTAATTACACAATTAAA 99885 

Qy 2988 gatacgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagag 3047 

II III I Ml I II I II I II II 
Db 99884 TTATTTTATTTTATATATTAATTAAGTTAATTAATAAATGGTTATTTTTATTTTAATTAA 99825 

Qy 3048 tgatcaaagtttgagctgccttcaatgagccaatttttgcccataatggataaaggcaat 3107 

II II I HIM II II III III II II 
Db 99824 AATGTGAAATATATTATTTATTATATTAAAAAAAATATATATATATTTAATTAATTAATT 99765 

Qy 3108 ttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtggcctgg 3167 

II I I I I II III llllllll III I 
Db 99764 ATATGTTTATATTTTAATTTAAAATAAAT — ATATTTTATAAAATTAATTATTTATTA 99709 

Qy 3168 tcacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaatgtaata 3227 

I I I llll II III II I llll III I Ml 
Db 99708 AATAAATAAAATAAAATAATAAATATTATTATAATAAATAACATAAAATTATTTATTACT 99649 

Qy 3228 ttatattt-taaaataaaattatgttatttagattcttaatattttggagcattccata 3285 

MIMM I llllllll III III llllllll II MM 
Db 99648 ATATATTTAATTAAATAAAAGTATAAAATTATCTTTCTTAATTTATATTTTAAATTAATT 99589 

Qy 3286 ctataatttcgtaacataatattaaaatatagtaatataaagtgtaattaactttaaatt 3345 

II I II I III III I III MIMI II I 
Db 99588 ATTAATATATATATATAATTATAAAATAAATTATTATTTTAATTTAATTAATAATATTAT 99529 

Qy 3346 acaagcataatattaaattttgaatcaattaatttttatttctattattttaattaattt 3405 

I II I MM I II III Ml III MM I III I M 
Db 99528 ATAATATTTATATTATTTGTTTAAT ATTTAATTACTATTAAATATATTTAATT 99476 

Qy 3406 agtctattttttcaaaataaaatttaaatctaaataaaaataatttttccttaatgttga 3465 

I I III III I II I I MIIMIMI II II II II 
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KEYWORDS . 
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1283)
AUTHORS John,M.
TITLE Genetically engineering cotton plants for altered fiber
JOURNAL Patent: US 5620882-A 17 15-APR-1997;
FEATURES Location/Qualifiers
source 1..1283
/organism="unknown"
BASE COUNT 509 a 233 c 251 g 290 t
ORIGIN

Query Match 4.8%; Score 265.4; DB 5; Length 1283;

Best Local Similarity 84.3%; Pred. No. 6.2e-22; 

Matches 316; Conservative 0; Mismatches 46; Indels 13; Gaps 1; 

Qy 4132 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4191 

I II I Mill! Illlllllll I llllllllllllll I I! I III llllll 
Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72 

Qy 4192 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4251 

lllllllllllllllllllllll I Illlllllll I I llllll Mill II III 
Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4252 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4311 

minium imm immi mmmiimii iiiiiimm 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 
Qy 4312 tcaaaatacgaaagcacgagagtctgaatacgaaaagccagaatacaaacagccaaagta 4371 

imiiiiiiiii i ii ii ii i minimi urn n 

Db 193 TCAAAATACGAAA AGCACAAAGAGTCTGAATACAAACAACCAAAATA 239 

Qy 4372 tcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaacc 4431 

iiiiii urn mm iimiini imm iimiiimmimii 

Db 240 TCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACC 299 
Qy 4432 ctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacga 4491 

imimi iiiiiiiimiiiiiiiim iiiiii mum iiiimm 

Db 300 CTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGA 359 
Qy 4492 gaaagaaaatctcga 4506 

immi i m 

Db 360 TAAAGAAAAACCCGA 374 



AC005504 
LOCUS AC005504 104992 bp DNA HTG 01-APR-1999
DEFINITION Plasmodium falciparum chromosome 12, *** SEQUENCING IN PROGRESS
***, 3 unordered pieces.
ACCESSION AC005504
VERSION AC005504.3 GI:4558584
KEYWORDS HTG; HTGS_PHASE1.
SOURCE malaria parasite P. falciparum.
ORGANISM Plasmodium falciparum
Eukaryota; Alveolata; Apicomplexa; Haemosporida; Plasmodium.
REFERENCE 1 (bases 1 to 104992)
AUTHORS Hyman,R.W., Fung,E.L., Qin,F., Tamaki,T., Kurdi,O.B., Conway,A.B.
and Davis,R.W.
TITLE Plasmodium falciparum 3D7 chromosome 12
REFERENCE 2 (bases 1 to 104992)
AUTHORS Hyman,R.W., Qin,F., Fung,E.L., Conway,A.B. and Davis,R.W.
TITLE Direct Submission
JOURNAL Submitted (21-AUG-1998) Stanford DNA Sequencing and Technology
Center, Stanford University, 855 California Avenue, Palo Alto, CA
94304, USA
COMMENT On Apr 2, 1999 this sequence version replaced gi: 4337172.
* NOTE: This is a 'working draft' sequence. It currently
* consists of 3 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 58642: contig of 58642 bp in length
* 58643 58842: gap of unknown length
* 58843 91011: contig of 32169 bp in length
* 91012 91211: gap of unknown length
* 91212 104992: contig of 13781 bp in length.
FEATURES Location/Qualifiers
source 1..104992
/organism="Plasmodium falciparum"
/db_xref="taxon:5833"
/chromosome="12"
BASE COUNT 44286 a 9326 c 9564 g 41411 t 405 others
ORIGIN



Query Match 3.6%; Score 197; DB 41; Length 104992;
Best Local Similarity 45.7%; Pred. No. 1.2e-14;
Matches 1118; Conservative 0; Mismatches 1290; Indels 37; Gaps 11;

Qy 1564 attgtgtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtcaat 1623 

III I III I I I III I II II I III II III I I I I 
Db 72352 ATTTTATTTATTTCATTAAAAAAGGATAAAGACAATAATAAATTATAATAATAAAAAACA 72411 

Qy 1624 tgaacgttataaaattctctatgatatcctgatctgtttattacattatatgtgtttatg 1683 

I II I I II II III II II II I I I II : 
Db 72412 TTCCAAATATATACCCCCAAATATATATTATATATGTATAAGACTATACAAATATATAAA 72471 

Qy 1684 cttgagttaagtcaaacattgagattcatagctcacccaattatttaatcatttcaggca 1743 

II I I I II III I I I II I 

Db 72472 TATAATAATCACATATTAATATAATATATATTTATTAATTTATAAAATAAATAAAAATAT 72531 

Qy 1744 atctgcagacttaggattggatggcgttcaggagcttggattggttttctcacatcata't 1803 

II II I III II II I II I I I II "I 

Db 72532 ATATATATAATTATATAATATATCAAATTAAATCATTATAAAATTTATTTAAAATATATT 72591 

Qy 1804 tttattaaataattattaattaaaatttatggacttttggactgtctgactaattttcag 1863 

IMIMI III Hill I II I II I I I I I 
Db 72592 AAAATTAAATATATATATATTAATAAATAATTAAGTTAATTATTTAATAAATAAAAATAA 72651 

Qy 1864 aattttattttggttttgggttttgttgaattttttagataattattttaaatattctgc 1923 

I I MM I II I I I I I INI II I I I I II 

Db 72652 TAATAAATTTAAATATTAATATAATTAAATTCATAATACACATTAATTAATAAAATATGA 72711 

Qy 1924 ataatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaa 1983 

III I I II I I I I I III III II III il I II 

Db 72712 ATATTAATATAAATAATAAATAGAAAAATATTAATACAATTTAAATATTAAATAAATAAA 72771 

Qy 1984 gaatttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataag 2043 

I II II I II I Mil II II III II M Mill 

Db 72772 AATATTATAATTTATAAATAATAAAATATTAATATAAATTAATTAATAATATATAATAAA 72831 

Qy 2044 ttagtattacgattttt--agtttgatttggtggaaagtaatgtatgtttttgaacata 2100 
III III I I III II III III MM III III 

Db 72832 TTAATATAATTAAATTTAAAATATAAATTAATAAAAAAATAATACTAATATTAATATAAA 72891 

Qy 2101 attatttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgt 2160 

II I I I Mill III MM III II MM I II I I I 

Db 72892 ATAAAATAATAATAAATAAATTTTAATTAAAATTAAAT--AATAAAATATTAATATAAAT 72949 

Qy 2161 aaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtag 2220 

II I I MM I II III I I I I I II II I I 
Db 72950 TAATTAAATAATAATATAATAAATTAATTAAATAATAATATAATAAATTAATTAATTAAC 73009 

Qy 2221 tcagtgtaactctcaaaatctggtcataacttctaggc-tgagtttgctgtgctacag 2277 

I III I MM I MM I II I I I I I I II II 

Db 73010 AATATTTAAATAATTAAATATAATAATATATATTAAACAATTAATTATTATAAATTAAAG 73069 
Qy 2278 tagtaagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaac 2337 
25 149.8 2.7 2426 8 SDU49822 U49822 Saccharomyc
26 149 2.7 242893 31 CEY53C12 Z92859 Caenorhabdi
27 146 2.6 15421 33 PFCOMPIRA X95275 P.falciparu
28 143.2 2.6 106650 39 AC007708 AC007708 Homo sapi
29 143 2.6 145670 50 AC008132 AC008132 Homo sapi
30 142.8 2.6 207957 11 AC004470 AC004470 Homo sapi
31 142 2.6 176552 39 AC004617 AC004617 Homo sapi
32 141.8 2.6 2426 8 SDU49822 U49822 Saccharomyc
33 141.4 2.5 98734 31 PFMAL1P2 AL031745 Plasmodiu
34 140.8 2.5 14433 34 AE001369 AE001369 Plasmodiu
35 140.4 2.5 108908 33 PFMAL3P8 AL034560 Plasmodiu
36 139.8 2.5 175516 60 AC006280 AC006280 Plasmodiu
37 139.2 2.5 145670 50 AC008132 AC008132 Homo sapi
38 139 2.5 14211 34 AE001368 AE001368 Plasmodiu
39 139 2.5 110000 31 PFMAL13P2J Continuation (2 of
40 138.4 2.5 14001 33 PFCOMPIRB X95276 P.falciparu
41 137.6 2.5 153098 33 PFMAL3P2 AL034558 Plasmodiu
42 137.4 2.5 176552 39 AC004617 AC004617 Homo sapi
43 137.4 2.5 224448 31 PFMAL4P4 AL035477 Plasmodiu
44 137.4 2.5 298987 58 AE003846 AE003846 Drosophil
45 136.8 2.5 153418 60 AC004153 AC004153 Plasmodiu



RESULT 1 

GBU34401
LOCUS GBU34401 1699 bp DNA PLN 01-JAN-1996
DEFINITION Gossypium barbadense FbLate-2 gene, complete cds.
ACCESSION U34401
VERSION U34401.1 GI:1143223
KEYWORDS
SOURCE sea-island cotton.
ORGANISM Gossypium barbadense
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicots; Rosidae; eurosids II; Malvales; Malvaceae; Gossypium.
REFERENCE 1 (bases 1 to 1699)
AUTHORS Rinehart,J., Petersen,M. and John,M.E.
TITLE Tissue-specific and Developmental Regulation of Cotton mRNA,
FbLate-2: Promoter Studies in Transgenic Plants
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 1699)
AUTHORS John,M.E.
TITLE Direct Submission
JOURNAL Submitted (21-AUG-1995) Maliyakal E. John, Fiber Technology,
Agracetus, 8520 University Green, Middleton, WI 53562, USA
FEATURES Location/Qualifiers
source 1..1699
/organism="Gossypium barbadense"
/strain="Sea island"
/db_xref="taxon:3634"
/clone="FbL2-82A"
mRNA 369..1585
/gene="FbLate-2"
gene 369..1585
/gene="FbLate-2"
CDS 379..1380
/gene="FbLate-2"
/codon_start=1
/protein_id="AAA84881.1"
/db_xref="GI:1143224"
/translation="MIGSHTVSTAARRLFETQTTSSELPQLASRYEKQEESEYEKPEY
KQPKYDEEYPKHEKPEIHKEEKQKPCKQHEEYHESHKSKEHEEYQKEKPEFPKLEKPK
EKHEVEYPEIPEYKERQDEGKEHKHEECHKSHESREHEEYEKERPNFPKGEKPKEHEK
VEYPRIPEYKERQDEGKEHKHEFQRHEKEEERRPEKRAEYSEWPKSMFTQSGSGTRP"
polyA_signal 1448..1454
/gene="FbLate-2"
BASE COUNT 661 a 328 c 328 g 382 t



Query Match 9.1%; Score 503; DB 7; Length 1699; 

Best Local Similarity 89.4%; Pred. No. 4.8e-49;

Matches 589; Conservative 0; Mismatches 60; Indels 10; Gaps 

Qy 3858 tattactcgaactaaatgttgtcacaaattattatctaaataaagaa-aaacacttaat 3915 

MINI II I Mill I Mil! Illll III III II I Mil 
Db 1 TATTACCTGAGCCAAATGCTCTCACAAACTATTATCCAAAAAAAAAATGTTGAATATAAT 60 

Qy 3916 ttttataacattttttcatatatttgaaagattatattttgtatatttacgtaaaaatat 3975 

llllllllllllllllllllllllll lllllllllllllllllllllllllllllllll 
Db 61 TTTTATAACATTTTTTCATATATTTGCAAGATTATATTTTGTATATTTACGTAAAAATAT 120 

Qy 3976 ttgacatagattgagcaccttcttaacataatcccaccataagtcaagtatgtagatgag 4035 

lllllllllillll lllllllllllllllllllllllllllllllllllllllllllll 
Db 121 TTGACATAGATTGAACACCTTCTTAACATAATCCCACCATAAGTCAAGTATGTAGATGAG 180 

Qy 4036 aaattggtacaaacaacgtggggccaaatcccaccaaaccatctctcattctctcctata 4095 

lllllllllilllllllllllllllllllllllllllllllllllllll llllllllll 
Db 181 AAATTGGTACAAACAACGTGGGGCCAAATCCCACCAAACCATCTCTCATCCTCTCCTATA 240 

Qy 4096 aaaggcttgctacacatagacaacaatccacacacaaatacac gttcttttcttt 4150 

Minn i mini iiiimiiimiiiiiiiiiii nun in 

Db 241 AAAGGCTAGTTACACATACACAACAATCCACACACAAATACACTCAAAATTCTTTGCTTT 300 

Qy 4151 ctattt-gattaaccatggctcatagcattcgtcaccctttcttccttttccaactttta 4209 

Db 301 GTATTTCGGTTAACCATGGCTCAT 360 ' 

Qy 4210 ctcataagtgtctcactagtgaccggtagccacactgtttcggcagcggctcgacgttta 4269 

IIIIMIIIIIIIIMI III Illll II II III II III IIMII.IIIIIIIIIII 
Db 361 CTCATTAGTGTCTCACTAATGATCGGTAGCCACACCGTCTCGACAGCGGCTCGACGTTTA 420 

Qy 4270 ttcgagacacaagcaacctcatcagagctcccacaattggcttcaaaatacgaaagcacg 4329 

llllllllllll llllllllll III I llllllll llllllllllllllll I 
Db 421 TTCGAGACACAAACAACCTCATCGGASTTGCCACAATTAGCTTCAAAATACGAAAAGCAG 480 

Qy 4330 --agagtctgaatacgaaaagccagaatacaaacagccaaagtatcacgaagagtactca 4387 

llllllllllll llllllll 111111111111111111111 lllllllllll 41 
Db 481 GAAGAGTCTGAATATGAAAAGCCGGAATACAAACAGCCAAAGTATGACGAAGAGTACCCA 540 

Qy 4388 aaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgcaaacagcatgaa 4447 

mi minimum n inmnnmnnni inn n mm 

Db 541 AAACATGAGAAGCCTGAAATTCACAAGGAGGAAAAACAAAAACCGTGCAAGCAACATGAA 600 
Qy 4448 gagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaagaaaatctcga 4506 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 mi imiiii iniim minim i m 

Db 601 GAGTACCACGAGTCACACAAATCGAAGGAGCACGAAGAGTACCAGAAAGAAAAACCCGA 659 



RESULT 2 
118362 

LOCUS 118362 1283 bp DNA PAT 07-OCT- 

DEFINITION Sequence 17 from patent US 5495070. 

ACCESSION 118362 

VERSION 118362.1 GI:1598717
KEYWORDS .
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1283)
AUTHORS John,M.
TITLE Genetically engineering cotton plants for altered fiber
JOURNAL Patent: US 5495070-A 17 27-FEB-1996;
FEATURES Location/Qualifiers
source 1..1283
/organism="unknown"
BASE COUNT 509 a 233 c 251 g 290 t
ORIGIN

Query Match 4.8%; Score 265.4; DB 5; Length 1283;
- Web : www.genoscope.cns.fr)
COMMENT Determination of this BAC-end sequence was carried out as part of a
collaboration with the Berkeley Drosophila Genome Project (BDGP).
The BDGP is constructing a physical map of the Drosophila
melanogaster genome using these BACs. For further information
please see http://www.fruitfly.org The BDGP Drosophila
melanogaster BAC library was prepared by Kazutoyo Osoegawa and
Aaron Mammoser in Pieter de Jong's laboratory in the Department of
Cancer Genetics at the Roswell Park Cancer Institute in Buffalo,
NY. The library is named RPCI-98 and was constructed by partial
EcoRI digestion of Drosophila DNA provided by the BDGP from the
isogenic strain y2; cn bw sp, the same strain used for the BDGP's
P1 and EST libraries. A more detailed description of the library
and how to order individual BAC clones, the entire library, or
filters for hybridization from the BACPAC Resource Center can be
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm.
FEATURES Location/Qualifiers
source 1..1101
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/clone_lib="RPCI-98"
/clone="BACR08K08"
/note="end : TET3"
BASE COUNT 395 a 120 c 103 g 334 t 149 others
ORIGIN



Query Match 2.1%; Score 113.2; DB 122; Length 1101; 

Best Local Similarity 42.9%; Pred. No. 5.9e-08;

Matches 268; Conservative 74; Mismatches 272; Indels 10; Gaps 

Qy 3323 aattaactttaaattacaagcataatattaaattttgaatcaattaatttttatttctat 3382 

II II :|llll I =1 I I I I I : II :| :|| I II 

Db 488 AAAAMTOTTAAAAAAAWAAAAACCTTTAATAAAKAAAAAAMAAWAAAAAAAAAAAWTTT 547 

Qy 3383 tattttaattaatttagtctattttttcaaaataaaatttaaatctaaataaaaataatt 3442 

I I:: : II II I I MM MM III III lllh I II 
Db 548 TTTWWAMTTTTATAMTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAWMATTT 607 

Qy 3443 tttccttaatgttgaaacaactcatgttatacttcaaaattataagtattatatttacct 3502 

III II I II I :| II : I : I M II Mil I 

Db 608 TTTTTTTTTTTTTTTTTTTTTTAWTTTTTWTATNTTTWTTAATTTTTAAATAAWTTTTAT 667 

Qy 3503 tgatgatttatttattagtatattaattctgattataattatggtgggatacaatcgctt 3562 

I I: I MM I I I I II I Mill I III 1:1 
Db 668 TTAWNAMWATTAAAAAAAAAAAAAAAAAAAAATATAAAAAAAAAATTATAAAWTWTAAA 727 

Qy 3563 tccactaaatattttaactatgatttataaatttatttcaacatcgtatatttacttatt 3622 

I MM: II Ml III I I III: 1 1 : 1 1 II II I h 
Db 728 TATA - TAWTWTTTAWAMTATATTAAAAAAAWATAWTTTTATTTAT ATTAAAAATATWAT 786 

Qy 3623 aatacataatttatcataattttatggaaattgagaccaagaaacattaagagaacaaat 3682 

Ml III I : 111:111; I II II I I I I :: I I I I |: 
Db 787 WATTTATATATNWNWATAWTTTWTTTAAATTTWATAWTTAAATWTMATTAAAWATTNTAW 846 

Qy 3683 tctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaac 3742 

M M Ml I III MM III II |:h ::|||:|! I : |: 
Db 847 AATAAAMWAAAAAAATAATAAAAATAATWTA- -TWTWATAWWTTTWAAATAAWTAAAWT 904 

Qy 3743 caaacacaaaaattcaaatcaaatgaactaaataagataatataacatacggaacatctt 3802 

IM I IMMII IMI III :| : IM : |: : :| II : I 
Db ,905 AAWAAAAAAAAATTAAWATAAAAATWAAWTAWTTWTTTMTYTAWA --AAWYWATT 957 

Qy 3803 acttgtaatcttacattcccataattttattatgaaaaataatcttatattactcgaact 3862 

I : Ml II | MMM II IM Mill II I I MM ;| 

Db 958 AWAMAWAATATTTTTATATWATWATTATAATAWWAAAAAAAAAATAAAATWAAATNWAWA 1017 

Qy 3863 aaatgttgtcacaaattattatctaaataaagaaaaacacttaatttttataacattttt 3922 

"I: MIMI IMM |: MM: II 1 1 1 : 1 Ml II I: II I: : 
Db 1018 WWAYAAAWAYACWAAAMWTMTATWYATAWMAAAAAAMAAHTANATTATWAAAWAWMAHM 1077 

Qy 3923 tcatatatttgaaagattatattt 3946 

. : : Mil : :: I III 



Db 1078 WAWCWTATCCNCWYCWYATTTTTT 1101 



RESULT 13 
CNS009G1/C 
LOCUS       CNS009G1 876 bp DNA GSS 03-JUN-1999 
DEFINITION  Drosophila melanogaster genome survey sequence TET3 end of BAC 
            BACR19J14 of RPCI-98 library from Drosophila melanogaster (fruit 
            fly), genomic survey sequence. 
ACCESSION   AL053529 
VERSION     AL053529.1 GI:4935018 
KEYWORDS    GSS. 
SOURCE      fruit fly. 
  ORGANISM  Drosophila melanogaster 
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; 
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila. 
REFERENCE   1 (bases 1 to 876) 
  AUTHORS   Genoscope. 
  TITLE     Direct Submission 
  JOURNAL   Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage : 
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr 
            - Web : www.genoscope.cns.fr) 
COMMENT     Determination of this BAC-end sequence was carried out as part of a 
            collaboration with the Berkeley Drosophila Genome Project (BDGP). 
            The BDGP is constructing a physical map of the Drosophila 
            melanogaster genome using these BACs. For further information 
            please see http://www.fruitfly.org The BDGP Drosophila 
            melanogaster BAC library was prepared by Kazutoyo Osoegawa and 
            Aaron Mammoser in Pieter de Jong's laboratory in the Department of 
            Cancer Genetics at the Roswell Park Cancer Institute in Buffalo, 
            NY. The library is named RPCI-98 and was constructed by partial 
            EcoRI digestion of Drosophila DNA provided by the BDGP from the 
            isogenic strain y2; cn bw sp, the same strain used for the BDGP's 
            P1 and EST libraries. A more detailed description of the library 
            and how to order individual BAC clones, the entire library, or 
            filters for hybridization from the BACPAC Resource Center can be 
            found at http://bacpac.med.buffalo.edu/drosophila_bac.htm. 
FEATURES             Location/Qualifiers 
     source          1..876 
                     /organism="Drosophila melanogaster" 
                     /db_xref="taxon:7227" 
                     /clone_lib="RPCI-98" 
                     /clone="BACR19J14" 
                     /note="end : TET3" 
BASE COUNT  335 a 54 c 57 g 325 t 105 others 
ORIGIN 



Query Match 2.0%; Score 112.6; DB 122; Length 876; 

Best Local Similarity 42.4%; Pred. No. 7.4e-08; 

Matches 229; Conservative 54; Mismatches 255; Indels 2; Gaps 

Qy 4901 tttaaactttttacaagttaagacatgtataaatatatgacaatataattacaagtttta 4960 

Mill :: I III I I MMIM I I MIM I I III 
Db 875 TATATAATANAWWATATNTAATAAAAAATATWATATATAATATATAWATWATATATATTA 816 

Qy 4961 gttcaatgttagctatcttagtatgttattgatgatcttaattacatttaaacaaattcc 5020 

I I II III III II MM I : Ml I I: M I I: 
Db 815 "TATANTATATATATATTATAATWATAATTAATANWWAWATAAWTTWAWTAAANTTWTW 758 

Qy 5021 acttaaaattttaataaataataacaaataattattgtaatataatacattaaatgcaac 5080 

I: I: I Mil III M IMI Ml I MM I II II I 
Db 757 NTATWTAWNTATAATATATATWATNTAWTATATATTWTTNTATAAAAATATATATATATA 698 

Qy 5081 aaaaaatgaaataaataaaataaaatagcaaataattgttataatattgtaatataatat 5140 

: II II M : : 1 1 1 : III : MM II: Ml :: MM I 
Db 697 WNAATATTTWAAWWATATTWAAAATATTATWNTAAWATATAWTATATATWWTWATATAAA 638 

Qy 5141 gtaccatattcttaactgaaatagggtctaacctataatccctaaaatttcagtttaaat 5200 

I II I II I :: I I III IM II : MM I I 

Db 637 TTNTATTAATTTTTATWTTWTTTATATATAAATTWTTATATAAWTTAWWTAAWTTTWAWT 578 
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BASE COUNT 
ORIGIN 



Location/Qualifiers 
1..1187 
/organism="Arabidopsis thaliana" 
/strain="Columbia" 
/db_xref="taxon:3702" 
/clone="F19C22" 
/clone_lib="IGF" 
/sex="hermaphrodite" 
/note="Vector: BeloBACII; Site_1: EcoRI; Site_2: EcoRI; 
Produced by Thomas Altmann" 
385 a 51 c 60 g 594 t 97 others 



Query Match 2.1%; Score 115.2; DB 120; Length 1187; 

Best Local Similarity 47.0%; Pred. No. 2.9e-08; 

Matches 362; Conservative 0; Mismatches 402; Indels 6; Gaps 

Qy 3200 ttatattacggaatgtaatattatattttaaaataaaattatgttatttagattcttaat 3259 

I Mill! I Mil II I Mill II I III I II I II I 
Db 1185 TAATATTATTTTTTATAATTATAAAAATTAAATTATATATATATAATAAAAATAAAATTT 1126 

Qy 3260 attttggagcattccatactataatttcgtaacataatattaaaatatagtaatataaag 3319 

1 1 I II III II III II III II lllll! II III 
Db 1125 TAITANAATAATATAATAATAATAATTATTTATATATAATATAAATATTAAAAATAAAAT 1066 

Qy 3320 tgtaattaactttaaattacaagcataatattaaattttgaatcaattaatttttatttc 3379 

I lllll I Mil II II I III III I I II III III 
Db 1065 TATAATTTAAANTAAAATATAATTAATATAATAATATAAAATATAAATAAAAATTAAAAT 1006 

Qy 3380 tattattttaattaatttagtctattttttcaaaataaaatttaaatctaaataaaaata 3439 

iii mi ii ii i i i iii i mi linn 

Db 1005 TATAATTTAAAATATTAATATTATAAAATATTTTTATTAATANTATATTAAAAAAAAATT 946 
Qy 3440 atttttccttaatgttgaaacaactcatgttatacttcaaaattataagtattatattta 3499 

mi nn i i ii i i i i inn n i 

Db 945 TATTTTIATTAAAAAAAANAANNAAATATTTTTTATAAATATTTAAAATATAAATTAAAA 886 

Qy 3500 ccttgatgatttatttattagtatattaattctgattataattatggtgggatacaatcg 3559 

I II II II II llll II I II II II II 

Db 885 AAAATTTTTTTAATATAATAATATAAAAAAATATAAANAAAATAANAAAATAAAATATAT 826 
Qy 3560 ctttccactaaatattttaactatgatttataaatttatttcaacatcgta-tatttac 3617 

ii i i ii ii i ii iiiii n mill ii i ii in 

Db 825 TTTAAAAAATTAAATATTTAAAATAATTTAAAANATTATTTAAAATTAATATTTATNANA 766 

Qy 3618 ttattaatacataatttatcataattttatggaaattgagaccaagaaacattaagagaa 3677 

I II I llll llll II I II II II I III I 
Db 765 ANAAAAAAAAATAAATTATTATTAATTAATAAAATAATTTATATTAATAAAAATIAATNT 706 

Qy 3678 caaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagtactc 3737 

lllll III I II I I I III II I I I III 
Db 705 TAAATTTTATTATTAAAAATNTNAAATAAATAAAAAAAAAATTAAAATANAATAAAAATT 646 

Qy 3738 ttaaccaaacacaaaaattcaaatcaaatgaactaaataagataatataacatacggaac 3797 

ii ii i mini in i n i i in m ii i ii ii 

Db 645 TTTTTTAATTAIAAAAATTAAAAAAATATATAATTTATATCTTTANATTAAATTTAAAAT 586 

Qy 3798 atcttacttgtaatcttacattcccataattttattatgaaaaataatcttatattactc 3857 

I II II II lllll III III II I I II II I 
Db 585 AAATTTATTAAAA----ACATTAAAATATTTTATAAATAAIATTAAAAAATAATAATATN 530 

Qy 3858 gaactaaatgttgtcacaaattattatctaaataaagaaaaacacttaatttttataaca 3917 

I II II I I llll III III I II II I llll 
Db 529 AATTANAAAATTAATAAACATTAATATTTAATATNAAAATAATAANNTAAAATATAANNA 470 

Qy 3918 ttttttcatatatttgaaagattatattttgtatatttacgtaaaaatat 3967 

I I II II II I I I III lllll II 
Db 469 AAAAATAAAATTTTANAANNANNAAANNANAANNATAAAAAAAAAAAAAT 420 



RESULT 10 
CNS0021J 
LOCUS       CNS0021J 1101 bp DNA GSS 03-JUN-1999 
DEFINITION  Drosophila melanogaster genome survey sequence TET3 end of BAC 
            BACR05N11 of RPCI-98 library from Drosophila melanogaster (fruit 
            fly), genomic survey sequence. 
ACCESSION   AL061936 
VERSION     AL061936.1 GI:4940214 
KEYWORDS    GSS. 
SOURCE      fruit fly. 
  ORGANISM  Drosophila melanogaster 
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; 
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila. 
REFERENCE   1 (bases 1 to 1101) 
  AUTHORS   Genoscope. 
  TITLE     Direct Submission 
  JOURNAL   Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage : 
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr 
            - Web : www.genoscope.cns.fr) 
COMMENT     Determination of this BAC-end sequence was carried out as part of a 
            collaboration with the Berkeley Drosophila Genome Project (BDGP). 
            The BDGP is constructing a physical map of the Drosophila 
            melanogaster genome using these BACs. For further information 
            please see http://www.fruitfly.org The BDGP Drosophila 
            melanogaster BAC library was prepared by Kazutoyo Osoegawa and 
            Aaron Mammoser in Pieter de Jong's laboratory in the Department of 
            Cancer Genetics at the Roswell Park Cancer Institute in Buffalo, 
            NY. The library is named RPCI-98 and was constructed by partial 
            EcoRI digestion of Drosophila DNA provided by the BDGP from the 
            isogenic strain y2; cn bw sp, the same strain used for the BDGP's 
            P1 and EST libraries. A more detailed description of the library 
            and how to order individual BAC clones, the entire library, or 
            filters for hybridization from the BACPAC Resource Center can be 
            found at http://bacpac.med.buffalo.edu/drosophila_bac.htm. 
FEATURES             Location/Qualifiers 
     source          1..1101 
                     /organism="Drosophila melanogaster" 
                     /db_xref="taxon:7227" 
                     /clone_lib="RPCI-98" 
                     /clone="BACR05N11" 
                     /note="end : TET3" 
BASE COUNT  631 a 7 c 28 g 289 t 146 others 
ORIGIN 



Query Match 2.1%; Score 115; DB 122; Length 1101; 

Best Local Similarity 41.3%; Pred. No. 3.2e-08; 

Matches 324; Conservative 63; Mismatches 396; Indels 1; Gaps 

Qy 3119 acagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaaaaaaaaa 3178 

I I II II llll III llll II I I III 1 1 1 M 1 1 1 1 

Db 270 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMAAAAAAACAAAAAAAAAAA 329 

Qy 3179 ctaatgttggttggttgaattttatattacggaatgtaatattatattttaaaataaaat 3238 

II II I I I II II I I I llll llll 

Db 330 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 389 

Qy 3239 tatgttatttagattcttaatattttggagcattccatactataatttcgtaacataata 3298 

I III III II Mill II I II I 

Db 390 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 449 

Qy 3299 ttaaaatatagtaatataaagtgtaattaactttaaattacaagcataatattaaatttt 3358 

llll I I II I I! II II III III llll III 
Db 450 AAAAAAAAAAAAAAAAANAANAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAANAAN 509 

Qy 3359 gaatcaattaatttttatttctattattttaattaatttagtctattttttcaaaataaa 3418 

II II I II :| III : I III : II III I I lllhl II I II 

Db 510 AAAAATATAATTTAWTTTTTTWTTAATTAWTTTTTTTTTTTTTTTTTTTWTTAATTTTAA 569 

Qy 3419 atttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgttatacttca 3478 

III II: lllll II II : :| IIM II I : II : I I : II I 

Db 570 TTTTTAAWAWAAATTTAATAAAAWAWTWTTTAWTTTTAATWTAAWWAAAAAAAAWTTTTA 629 

Qy 3479 aaattataagtattatatttaccttgatgatttatttattagtatattaattctgattat 3538 
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Pl and EST libraries, A more detailed description of the library 
and how to order individual BAC clones, the entire library, or 
filters for hybridization from the BACPAC Resource Center can be 
found at http://bacpac.med.buffalo.edu/drosophila_bac.htm. 
FEATURES             Location/Qualifiers 
     source          1..1101 
                     /organism="Drosophila melanogaster" 
                     /db_xref="taxon:7227" 
                     /clone_lib="RPCI-98" 
                     /clone="BACR29P01" 
                     /note="end : TET3" 
BASE COUNT 366 a 66 c 104 g 351 t 214 others 



Query Match 2.1%; Score 118; DB 122; Length 1101; 

Best Local Similarity 40.7%; Pred. No. 1.1e-08; 

Matches 234; Conservative 81; Mismatches 258; Indels 2; Gaps 1; 
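
The counts on the "Matches" line appear to determine the "Best Local Similarity" figure printed above them. The sketch below is an observation drawn from the figures in this report, not documented behavior of the search program: it assumes the percentage is Matches divided by the sum of Matches, Conservative, Mismatches and Indels. The function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over all aligned columns, rounded to one decimal.

    Assumption: the report's "Best Local Similarity" equals
    matches / (matches + conservative + mismatches + indels).
    """
    aligned_columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


# Counts copied from three records in this report.
print(best_local_similarity(234, 81, 258, 2))   # record with Score 118
print(best_local_similarity(229, 54, 255, 2))   # RESULT 13
print(best_local_similarity(151, 310, 238, 4))  # record with Score 126.2
```

Under this assumption the three calls reproduce the printed values 40.7, 42.4 and 21.5, which makes the relationship useful as a consistency check on OCR-damaged statistics lines.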

Qy 4697 atagaaattctaaatggttatagtttatgttatagtgtatgttgtagtgaaaktaatttt 4756 

M ::| II II |::| lllll I I I :| I I II ::| III 
Db 969 AYATTTWKTATACATAATWWTTTTTTATATACAATTTWAAAATAAAAAWTAACWWAATTT 910 

Qy 4757 aaatgttgtatctaatgttaacatcacttggcttgatttatgttatgttatgtattttac 4816 

hi : I : I I II I I : :|| :: I : II :|| 
Db 909 AWAAAOTTWTTTTTMWWTAATTTWATTACAWTWTTAWWTAAAAAAATWTTAAAWTAA 850 

Qy 4817 tttaatgatattgcatgtattgttaatttaacattgcttgatcattatactcttctacta 4876 

: II ::IM I I : hi |: I h: I II llh: Mill 
Db 849 AWAAAAMWATTAWAATTTAYATKATATWTAAAWWTATAWATTATTMW--WAATWTTATA 792 

Qy 4877 ttaattataaatggcactgttttgtttaaactttttacaagttaagacatgtataaatat 4936 

:MM : I : :lh h : I Nil I : Mil I I I III 
Db 791 WNATTTTTTTTWTAWAAWTWTTWATWAWTAATTTTAAWTWWTTAAATAAAAAAMTTTAT 732 

Qy 4937 atgacaatataattacaagttttagttcaatgttagctatcttagtatgttattgatgat 4996 

I hi III: II lllh II :| |:: I I : I :|:|| I |: 
Db 731 TTTTATTTWTTATTWAAAWTTTTWTTTTWAATTMTTTAATAWTTAAAWTWTTAAAWAW 672 

Qy 4997 cttaattacatttaaacaaattccacttaaaattttaataaataataacaaataattatt 5056 

I hill II I : I I I : 1 : 1 1 : 1 I III hh: h: II II 
Db 671 WTAAWTTAAATAAATWTATTTAAAATWTWAAWTATAAATTTAMADWTTATAWWTTTTTTT 612 

Qy 5057 gtaatataatacattaaatgcaacaaaaaatgaaataaataaaataaaatagcaaataat 5116 

: :| :MII II llll II I III! : I III :::| II I :| ;:; I 
Db 611 WCWTTWAATATATAAAATWAAAAATTAAATTWTTTTTATATWWWATAAAAAMATWWWTT 552 

Qy 5117 tgttataatattgtaatataatatgtaccatattcttaactgaaatagggtctaacctat 5176 

lllllh: II I :h II llll II II I I I 
Db 551 ATTTATAAWWAATTATTTAWAWTTGATTTTTATTTATTTWTTTTTTAAAWTTTTATTATA 492 

Qy 5177 aatccctaaaatttcagtttaaatatttttatacctgccatattattagaactcttttta 5236 

: 111:1 I || I lllllll II I: MM I |::|||: 
Db 491 TWATTTATTAATWTTAWTTATYTTTTTTTTATATCTTTCMAAWTATTTMTCCCCYYTTTW 432 

Qy 5237 aatatattaaaattttaattataccaatttaattt 5271 

II II llll : I I I II II 
Db 431 TTMATGTTTTTTTTTTTHCTCTTNCNTTTNTTTNT 397 



RESULT 7 
B11102 
LOCUS       B11102 1187 bp DNA GSS 14-MAY-1997 
DEFINITION  F19C22-T7 IGF Arabidopsis thaliana genomic clone F19C22, 
            genomic survey sequence. 
ACCESSION   B11102 
VERSION     B11102.1 GI:2092386 
KEYWORDS    GSS. 
SOURCE      thale cress. 
  ORGANISM  Arabidopsis thaliana 
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; 
            Magnoliophyta; eudicotyledons; Rosidae; eurosids II; Brassicales; 
            Brassicaceae; Arabidopsis. 
REFERENCE   1 (bases 1 to 1187) 
  AUTHORS   Feng,J., Dewar,K., Buehler,E., Kim,C., Li,Y., Shinn,P., Sun,H. and 
            Ecker,J. 
  TITLE     BAC End Sequences at ATGC 
  JOURNAL   Unpublished (1997) 
COMMENT     On Sep 10, 1998 this sequence version replaced gi:3556525. 
            Other_GSSs: F19C22-Sp6 
            Contact: Ecker J. 
            Arabidopsis Thaliana Genome Center 
            University of Pennsylvania 
            Dept. of Biology, University of Pennsylvania, Philadelphia, PA 
            19104 
            Tel: 215-898-9384 
            Fax: 215-898-8780 
            Email: jecker@atgenome.bio.upenn.edu 
            Seq primer: T7 
            Class: BAC ends 
            High quality sequence start: 72 
            High quality sequence stop: 353. 
FEATURES             Location/Qualifiers 
     source          1..1187 
                     /organism="Arabidopsis thaliana" 
                     /strain="Columbia" 
                     /db_xref="taxon:3702" 
                     /clone="F19C22" 
                     /clone_lib="IGF" 
                     /sex="hermaphrodite" 
                     /note="Vector: BeloBACII; Site_1: EcoRI; Site_2: EcoRI; 
                     Produced by Thomas Altmann" 
BASE COUNT  385 a 51 c 60 g 594 t 97 others 
ORIGIN 



Query Match 2.1%; Score 117; DB 120; Length 1187; 

Best Local Similarity 43.6%; Pred. No. 1.6e-08; 

Matches 365; Conservative 0; Mismatches 470; Indels 3; Gaps 

Qy 3121 agaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaaaaaaaaact 3180 

II II II lllllll 

Db 328 AAANNNNhMNNNNNNNMNNNNNNANNNNNNNNMNNTNNNNNNTNNNAANAAAAAMT 387 

Qy 3181 aatgttggttggttgaattttatattacggaatgtaatattatattttaaaataaaatta 3240 

I II I III I I II I II I II 

Db 388 ATNNNNTNTTTTTNNNNNTNTNNNTNNNNTTTATTTTTTTTTTTTTATNNTTNTNNTTTN 447 

Qy 3241 tgttatttagattcttaatattttggagcattccatactataatttcgtaacataatatt 3300 

I III I III I III III I III llll I Mill 
Db 448 NTNNTTNTAAAATTTTATTTTTTNNTTATATTTTANNTTATTATTTTNATATTAAATATT 507 

Qy 3301 aaaatatagtaatataaagtgtaattaactttaaattacaagcataatattaaattttga 3360 

II I II llll I I II II II I III llll II 

Db 508 AATGTTTATTAATTTTNTAATTNATATTATTATTTTTTAATATTATTTATAAAATATTTT 567 

Qy 3361 atcaattaatttttatttctattattttaattaatttagtctattttttcaaaataaaat 3420 

I II I llll I III II I I III Ml I I 

Db 568 AATGTTTTTAATAAATTTATTTTAAATTTAATNTAAAGATATAAATTATATATTTTTTTA 627 

Qy 3421 ttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgttatacttcaaa 3480 

i i iii nun iniin i n ii ii i i i ii ii 1 1 

Db 628 ATTTTIATAATTAAAAAAAATTTTTATTNTAT-TTTAATTTTTTTTTTATTTATTTNANA 686 
Qy 3481 attataagtattatatttaccttgatgatttatttattagtatattaattctgattataa 3540 

ii in ii i iiiii i i nun ii ii i i i iiii i 

Db 687 TTTTTAATAATAAAATTTAANATTAATTTTTATTAATATAAATTATTTTATTAATTAATA 746 

Qy 3541 ttatggtgggatacaatcgctttccactaaatattttaactatgatttataaatttattt 3600 

III I I I I I II III I II llllll III 
Db 747 ATAATTTATTTTTTTTTNTTNTNATAAATATTAATTTTAAATAATNTTTTAAATTATTTT 806 

Qy 3601 caacatcgtatatttacttattaatacataatttatcataattttatggaaattgagacc 3660 

II II II III II II II I I Mill II I I 

Db 807 AAATATTTAATTTTTTAAAATATATTTTATTTTNTTATTTTNTTTATATTTTTTTATATT 866 
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found at http://bacpac.med.buffalo.edu/drosophilaJ3achtn1, 
FEATURES             Location/Qualifiers 
     source          1..1101 
                     /organism="Drosophila melanogaster" 
                     /db_xref="taxon:7227" 
                     /clone_lib="RPCI-98" 
                     /clone="BACR08K10" 
                     /note="end : TET3" 
BASE COUNT  201 a 64 c 131 g 202 t 503 others 
ORIGIN 



Query Match 2.3%; Score 126.2; DB 122; Length 1101; 

Best Local Similarity 21.5%; Pred. No. 6.9e-10; 

Matches 151; Conservative 310; Mismatches 238; Indels 4; Gaps 2; 

Qy 3291 acataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataatat 3350 

I hi::::: | | :|| : ||| | |: : ||| | :| |:|| |; 
Db 398 ATAWAWWWWWTTTTTTTTAWAAAWAAAATAATTWWAAWAWAAAAAATTWWAAAAWAAAAW 457 

Qy 3351 taaattttgaatcaattaatttttatttctattattttaattaatttagtctattttttc 3410 

: ::| hi II II III: | II 1 : 1 1 |:| : | : | :::|| 
Db 458 AWTAmTTAWTWAAAAAAAAmTTWTTTTTTTWTTAWTTWATAWWMTAAAW 517 

Qy 3411 aaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgtt 3470 

MM Mil :lll: MINI :|::|::||: |:||: I |: :|| :::| I 
Db 518 AAAAAAAAAAAWAMWAAAWATAAATWTWWTWWTTYTTWAAWATAAAMCMAAWYYHTYTT 577 

Qy 3471 at-acttcaaaattataagtattatatttaccttgatgatttatttattagtatattaat 3529 

I :::|:: |:: : | : | : | : 1 1 : : : : : : ;::;:|:::| : ::::|| 
Db 578 YTYHYYTYWOTMTWHYHTMYTHAWAHTTWYHWYHTYAMW^ 637 

Qy 3530 tctgattataattatggtgggatacaatcgctttccactaaatattttaactatgattta 3589 

:: :| :: : |:| |:: :|:: :|:::| ::::::: | : : |: 
Db 638 AYYYYYTCMYYYHYMHWBHAB^HAMMTHTWOTHAYHWATYHYYYYMYCAMMCMCTHT 697 

Qy 3590 taaatttatttcaacatcgtatatttacttattaatacataatttatcataattttatgg 3649 

:: ::: ::: |:: : :::|:::: :::: ::: || |:: ::::: :: ::| 
Db 698 CHHCYYYYHHYTAHHTHTHHWYAHYYMWYYMWAYYWMYCTACTYHYHHHHHYHWAYHTTW 757 

Qy 3650 aaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaaa 3709 

|: : : : ::| II I :|| | :: ::||::::: :: : : :: : :: 
Db 758 YAWAHAMWMWHHAHYAAAAAWAAWATTHHYHHTTHYMHHTYMYHYYMYTCCYMCTYHCWH 817 

Qy 3710 atgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaa 3769 

: ||!| :|: ::::|:|:: :: : :: ::|: :::::: :| | :|| : 
Db 818 YYHTAYTCWTWTHHWMWTWTHWYHHTWWHHTTTHWAWWHTHTWCWWWWHATTWTWATHCW 877 

Qy 3770 ctaaataagataatataacatacggaacatcttacttgtaatcttacattcccataattt 3829 

Db 878 ACMTMHWHHWIHMHHHHMACHMHHTHMCMCffl 937 



Qy 3830 tattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattatta--tct 3886 

Db 938 HHmTMTMTTMMMMCCMMHHHCHMYHMMHMYMYCCHYYCTCHTHATTHYHYMCTCY 997 

Qy 3887 aaataaagaaaaacacttaatttttataacattttttcatatatttgaaagattatattt 3946 

: |:: : I : I I :| || |:::::::: :: :::::::: |: | | :::|: 

Db 998 HtCTWHTYWTAYWWAWTAHAMTTATWWWWMHWAMTWWWWWWWWATAMCO 1057 

Qy 3947 tgtatatttacgtaaaaatatttgacatagattgagcaccttc 3989 

Db 1058 YHTHCTWYYHHTYHMWWAmWMHMAHYHWAHHCWYnM 1100 



RESULT 4 
CNS0167M/C 
LOCUS       CNS0167M 1201 bp DNA GSS 26-JUL-1999 
DEFINITION  Drosophila melanogaster genome survey sequence T7 end of BAC 
            BACN15M24 of DrosBAC library from Drosophila melanogaster (fruit 
            fly), genomic survey sequence. 
ACCESSION   AL106396 
VERSION     AL106396.1 GI:5621701 
KEYWORDS    GSS. 
SOURCE      fruit fly. 
  ORGANISM  Drosophila melanogaster 
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; 
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila. 
REFERENCE   1 (bases 1 to 1201) 
  AUTHORS   Genoscope. 
  TITLE     Direct Submission 
  JOURNAL   Submitted (23-JUL-1999) Genoscope - Centre National de Sequencage : 
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr 
            - Web : www.genoscope.cns.fr) 
COMMENT     Determination of this BAC-end sequence was carried out as part of a 
            collaboration with the European Drosophila Genome Project (EDGP) - 
            http://www.edgp.ebi.ac.uk -. This Drosophila melanogaster BAC 
            library (DrosBAC) was made by Alain Billaud at CEPH (Centre 
            d'Etude du Polymorphism Humain) with funding provided by a MRC 
            project grant. The DNA was prepared from embryos by Alain Bucheton 
            and Genevieve Payan. It has been constructed in the vector 
            pBeloBAC11. 
FEATURES             Location/Qualifiers 
     source          1..1201 
                     /organism="Drosophila melanogaster" 
                     /plasmid="pBeloBAC11" 
                     /db_xref="taxon:7227" 
                     /clone_lib="DrosBAC" 
                     /clone="BACN15M24" 
                     /note="end : T7" 
BASE COUNT  323 a 87 c 79 g 551 t 161 others 
ORIGIN 



Query Match 2.2%; Score 121.6; DB 123; Length 1201; 

Best Local Similarity 37.4%; Pred. No. 3.3e-09; 

Matches 280; Conservative 101; Mismatches 368; Indels 0; Gaps 

Qy 3211 aatgtaatattatattttaaaataaaattatgttatttagattcttaatattttggagca 3270 

Ml :h MM I III I II II: : II II h M I I I 
Db 1193 WATAWAWATATATAAATATAAAAATAAATAWAWAATATATAWNAAAAANTATAAAAAAWA 1134 

Qy 3271 ttccatactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaact 3330 

I:: Mill : II :::|IIMIII : I I :IM M II 
Db 1133 AMTAWWATATAMWMWAMWMMTATTAMTAWAATATAANWAAAAAAAAAAAAAAA 1074 

Qy 3331 ttaaattacaagcataatattaaattttgaatcaattaatttttatttctattattttaa 3390 

::: II: I Ml : I I III II Mill III :|: M 

Db 1073 AWWWTTTHTANAATATTTWTTTWNTATAWAATWTTTTTTTTTTTTTTTTATAWAWAAAWA 1014 

Qy 3391 ttaatttagtctattttttcaaaataaaatttaaatctaaataaaaataatttttcctta 3450 

'II I I II I h I 1:11111111 I : hM lh 
Db 1013 AAAAAAAATTTTAAAAATAAAWTAATTATWAAAATTTTTAAAAATTTTTWATWWTTTTTW 954 

Qy 3451 atgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgatt 3510 

I III :| I : :| II I MIMI :h M III III 
Db 953 AAAAAAAAAAWAATATWAAAWTTTTTTTTATWTATAAAWAWAWTTTTTTYWTTAAAAAAA 894 

Qy 3511 tatttattagtatattaattctgattataattatggtgggatacaatcgctttccactaa 3570 

1:11:1111 MM : I I II :: I : 
Db 893 AAAWTTAAAAATWTAAAAATTATAAAATAAAAWAAAAAAAAAAAAAAAAAWWAAAAAWTT 834 

Qy 3571 atattttaactatgatttataaatttatttcaacatcgtatatttacttattaatacata 3630 

:MI III I I: I III I I I I: I I III MM hi 
Db 833 TWATTATAAWATTTTTWAAAAAAAAAAAATTTAATTYTTTTTNAAAATAAAAAAWAAAWA 774 

Qy 3631 atttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataac 3690 

::: I I ::| II Ihll I II Mill II : I -II I II 
Db 773 WWWAAAAAATWWTAAATATAAWTTAAATNCATAAAACAAAAAAWAATTWHATAAAAAAAA 714 

Qy 3691 aaagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacaca 3750 

III I II I lllllll I MM: |: |: ::: : II :|:: h: 
Db 713 AAAAAAAAAAAAAAAAAAAANATTTTTTTHACAAMMAMMAMMYMMMMCAAAAMAMMAAVM 654 

Qy 3751 aaaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgtaa 3810 
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gb_gssl3:* 
gb_gss14:* 
gb_gss15:* 
gb_gss16:* 
gb_gss17:* 
gb_gss18:* 
gb_gss19:* 
em_gss13:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
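
Because Pred. No. is an expected count of chance hits, smaller values indicate less likely chance matches, and a reader may want to screen results against a cutoff. The sketch below is a minimal illustration, assuming the two-line statistics layout used in this report ("Query Match ...; Score ...; DB ...; Length ...;" followed by "Best Local Similarity ...; Pred. No. ...;"). The names `parse_stats` and `passes_cutoff` and the cutoff value are hypothetical and are not part of the search program.

```python
import re

# Matches "Label value;" pairs such as "Score 118;" or "Pred. No. 1.1e-08;".
STAT_RE = re.compile(
    r"(Query Match|Score|DB|Length|Best Local Similarity|Pred\. No\.)"
    r"\s+([0-9.eE+-]+)%?;"
)


def parse_stats(text):
    """Return the labelled statistics of one result as a dict of floats."""
    return {label: float(value) for label, value in STAT_RE.findall(text)}


def passes_cutoff(stats, cutoff=1e-8):
    """True when fewer than `cutoff` chance hits are predicted (assumed threshold)."""
    return stats["Pred. No."] < cutoff


block = (
    "Query Match 2.1%; Score 118; DB 122; Length 1101;\n"
    "Best Local Similarity 40.7%; Pred. No. 1.1e-08;"
)
stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."], passes_cutoff(stats))
```

The parser expects cleaned statistics lines; OCR variants such as a comma used as a decimal point would need to be normalized before parsing.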



                              SUMMARIES 

Result           Query 
   No.   Score  Match  Length   DB  ID        Description 

     1   145.6    2.6    1101  122  CNS00EVL  AL069706 Drosophil 
     2     142    2.6    1101  122  CNS00EVL  AL069706 Drosophil 
     3   126.2    2.3    1101  122  CNS0039G  AL063921 Drosophil 
     4   121.6    2.2    1201  123  CNS0167M  AL106396 Drosophil 
     5   119.8    2.2    1101  122  CNS00EO7  AL069440 Drosophil 
     6     118    2.1    1101  122  CNS00EO7  AL069440 Drosophil 
     7     117    2.1    1187  120  B11102    B11102 F19C22-T7 I 
     8   116.8    2.1    1?01  122  CNS00KAE  AL077628 Drosophil 
     9   115.2    2.1    1187  120  B11102    B11102 F19C22-T7 I 
    10     115    2.1    1101  122  CNS0021J  AL061936 Drosophil 
    11   114.2    2.1    1?01  122  CNS003BD  AL064091 Drosophil 
    12   113.2    2.1    1?01  122  CNS003BD  AL064091 Drosophil 
    13   112.6    2.0     876  122  CNS009G1  AL053529 Drosophil 
    14   111.2    2.0    1101  122  CNS0021J  AL061936 Drosophil 
    15   110.8    2.0    1201  123  CNS0167M  AL106396 Drosophil 
    16   110.2    2.0    1?25  123  CNS0161D  AL106171 Drosophil 
    17   109.8    2.0    1101  122  CNS0039G  AL063921 Drosophil 
    18   109.8    2.0    1?01  123  CNS0145U  AL103740 Drosophil 
    19   109.6    2.0    1?01  122  CNS00EPO  AL069493 Drosophil 
    20   109.4    2.0    1?01  122  CNS00BO1  AL057419 Drosophil 
    21   108.2    2.0    1?01  122  CNS000B8  AL063632 Drosophil 
    22   108.2    2.0    1?01  122  CNS003BB  AL064089 Drosophil 
    23     108    2.0     ?34  122  CNS010MP  AL099163 Drosophil 
    24     108    2.0     ?36  122  CNS01100  AL099642 Drosophil 
    25     107    1.9     ?35  120  B10881    B10881 F24H6-Sp6 I 
    26     107    1.9    1?01  122  CNS00EPO  AL069493 Drosophil 
    27   106.8    1.9     ?18  102  AQ416310  AQ416310 RPCI-11-1 
    28   106.2    1.9    1?01  122  CNS001FB  AL060732 Drosophil 
    29     106    1.9     ?28  113  AQ739398  AQ739398 HSJ482_B 
    30   105.8    1.9     ?36  122  CNS01100  AL099642 Drosophil 
    31   105.2    1.9    1?01  122  CNS0042W  AL055440 Drosophil 
    32     105    1.9     ?28  122  CNS00DKY  AL071865 Drosophil 
    33     105    1.9    1?01  122  CNS00KAE  AL077628 Drosophil 
    34   104.4    1.9    1?01  122  CNS003BB  AL064089 Drosophil 
    35   104.4    1.9    1?01  122  CNS003DX  AL064587 Drosophil 
    36   103.8    1.9    1?01  122  CNS000B8  AL063632 Drosophil 
    37   103.6    1.9    1?01  123  CNS0145U  AL103740 Drosophil 
    38   103.2    1.9    1?01  122  CNS003DQ  AL064580 Drosophil 
    39   102.4    1.9     ?90   93  AQ026918  AQ026918 CIT-HSP-2 
    40   101.8    1.8     876  122  CNS009G1  AL053529 Drosophil 
    41   101.6    1.8    1?01  122  CNS00FYG  AL071206 Drosophil 
    42   101.4    1.8     ?28  122  CNS00DKY  AL071865 Drosophil 
    43   101.4    1.8     ?96  122  CNS00FUH  AL071063 Drosophil 
    44   101.2    1.8    1?05  122  CNS00KHX  AL077798 Drosophil 
    45     101    1.8     ?28  113  AQ739398  AQ739398 HSJ482J 


RESULT 1 
CNS00EVL 
LOCUS       CNS00EVL 1101 bp DNA GSS 04-JUN-1999 
DEFINITION  Drosophila melanogaster genome survey sequence T7 end of BAC 
            BACR29B23 of RPCI-98 library from Drosophila melanogaster (fruit 
            fly), genomic survey sequence. 
ACCESSION   AL069706 
VERSION     AL069706.1 GI:4949849 
KEYWORDS    GSS. 
SOURCE      fruit fly. 
  ORGANISM  Drosophila melanogaster 
            Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 
            Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; 
            Muscomorpha; Ephydroidea; Drosophilidae; Drosophila. 
REFERENCE   1 (bases 1 to 1101) 
  AUTHORS   Genoscope. 
  TITLE     Direct Submission 
  JOURNAL   Submitted (02-JUN-1999) Genoscope - Centre National de Sequencage : 
            BP 191 91006 EVRY cedex - FRANCE (E-mail : seqref@genoscope.cns.fr 
            - Web : www.genoscope.cns.fr) 
COMMENT     Determination of this BAC-end sequence was carried out as part of a 
            collaboration with the Berkeley Drosophila Genome Project (BDGP). 
            The BDGP is constructing a physical map of the Drosophila 
            melanogaster genome using these BACs. For further information 
            please see http://www.fruitfly.org The BDGP Drosophila 
            melanogaster BAC library was prepared by Kazutoyo Osoegawa and 
            Aaron Mammoser in Pieter de Jong's laboratory in the Department of 
            Cancer Genetics at the Roswell Park Cancer Institute in Buffalo, 
            NY. The library is named RPCI-98 and was constructed by partial 
            EcoRI digestion of Drosophila DNA provided by the BDGP from the 
            isogenic strain y2; cn bw sp, the same strain used for the BDGP's 
            P1 and EST libraries. A more detailed description of the library 
            and how to order individual BAC clones, the entire library, or 
            filters for hybridization from the BACPAC Resource Center can be 
            found at http://bacpac.med.buffalo.edu/drosophila_bac.htm. 
FEATURES             Location/Qualifiers 
     source          1..1101 
                     /organism="Drosophila melanogaster" 
                     /db_xref="taxon:7227" 
                     /clone_lib="RPCI-98" 
                     /clone="BACR29B23" 
                     /note="end : T7" 
BASE COUNT  419 a 91 c 60 g 299 t 232 others 
ORIGIN 



Query Match 2.6%; Score 145.6; DB 122; Length 1101; 

Best Local Similarity 38.7%; Pred. No. 9.2e-13; 

Matches 249; Conservative 130; Mismatches 260; Indels 5; Gaps 2; 


Qy 3326 taactttaaattacaagcataatattaaattttgaatcaattaatttttatttctattat 3385 

1 1:1:!::: : :| :|:: 1 :|||:|| :| :| II: :::: II 1 :::: 
Db 457 516 

Qy 3386 tttaattaatttagtctattttttcaaaataaaatttaaatctaaataaaaataattttt 3445 

:|:| 1: I: : 1 Ihl Ml: llllllll: 1 hi Ihlllll : 
Db 517 WTWATTWTTTWAmWTAWTAAAAAAAAAWATAATTTAAAWWAATAWATTAAWAATTTAW 576 

Qy 3446 ccttaatgttgaaacaactcatgttatacttcaaaattataagtatt-atatttaccttg 3504 

:: II II 1 : 1 : : II 1 MM II! I I! Ill II 
Db 577 AAWWTATATTAATWTATAAATWTWATTAATATAAAAAAATATTTTTTWATAAAATTTTTA 636 

Qy 3505 atgatttatttattagtatattaattctgattataattatggtgggatacaatcgctttc 3564 

II lllll III:! I I III:! :::!lllll :| 1 
Db 637 ATAATTTMTTAWTTATTAMTMWTWATTWWWTMTTAMTAATTTWAMTAWAAAAAA 696 

Qy 3565 cactaaatattttaactatgatttataaatttatttcaacatcgtatatttacttattaa 3624 

1 III 1: Ml II II MIM 1 : I I MM: I 
Db 697 AAAAAAAAAWATWAAWAATWATAWAIAAWTTAAAAWAATAAAAWAAWAATWAWWATAATA 756 

Qy 3625 tacataatttatcataattttatggaaattgagaccaagaaacattaagagaacaaattc 3684 

1: III 1 :| :||::: |:: :: 1 : 1 :|| I: |: : :l Mil 
Db 757 TWWATATATAWTTWTAWWWATWWAWWWTATAW----ATAWAATAWAAWAWAWATAAATAW 812 

Qy 3685 tataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaacca 3744 

:! :|| : :|: II :| :| II :: 1 :| II MIM M I 
Db 813 ATAWATWAAAWAWAWATAWWATWATATAWWAATAWAWAAAAAATWTAATATWAATWATAA 872 
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Db 3812 TTTGTAAATTTTAAAATAATCACATTTTGTTTATTTCTTTTTTATCGATAATATT" -GG 3756 

Qy 3392 taatttagtctattttttcaaaataaaatttaaatctaaataaaaataatttttccttaa 3451 

■ I II 1 1 1 1 1 1 1 1 1 1 1 I I Nil I II II I I Mil III 
Db 3755 TGGATTTGTCTATTTTTTTAGGAATTCATTTTATTATGTATTATCACTTTTTTGTTTTAT 3696 

Qy 3452 tgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgattt 3511 

III I I II I I II I I II MM 
Db 3695 TCATAATTATTTTGAAAATAGTAAATACCGTGTAAATATACAAACCTAAAAATGTTATTA 3636 

Qy 3512 atttattagtatattaattctgattataattatggtgggatacaatcgctttccactaaa 3571 

I II I III I II II I II I II I III I III Mil 

Db 3635 ACTTTTAAGTTTTTTTTTTTTTTTTTTTTTTTTATATTAAGAATAATTGTAACCATTAAA 3576 

Qy 3572 tattttaactatgatttataaatttatttcaacatcgtatatttacttattaatacataa 3631 

Mil I II Ml II Mil III I I III III II I Ml 
Db 3575 TATTGGAGTATAAATAAATATATATATTATAAC--GAGACAATTAGTTAAAAAAAAATAG 3518 

Qy 3632 tttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataaca 3691 

II I I II I I III I I II I I II I I I I III 

Db 3517 TTAAAAAAAAATCGTTAAAAAAAAATATGAAAATAAATGGATATATAATTGAATGAATAA 3458 

Qy 3692 aagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacacaa 3751 

I I II III I III II II I Mill lllll II III II 
Db 3457 CATAAAA---AGATGACAATTTATCAAACTGTTAATTTAAAATAACTTAATCATACAAAA 3401 

Qy 3752 aaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgtaat 3811 

III II III I III MM II III I II I 

Db 3400 AAAAAGGAACAAAAACAGGAAAAAGGAATAAAGTGTAAAGAAACACAAACAATTTAAAGA 3341 

Qy 3812 cttacattcccataattttattatgaaaaataatcttatattactcgaactaaatgttgt 3871 

II III IMIIIII III II HUM I II II I I 

Db 3340 CAAGCGAATTTATGAATTTATTATTTAAAGGTATATTATATATATAATCATAGATAATAT 3281 

Qy 3872 cacaaattattatctaaataaagaaaaacacttaatttttataacattttttcatatatt 3931 

III I I II III I I I I I I III II II 
Db 3280 TTAAAAGCAAAAAAAAGACAAACATATATCAGATTTGTATGTAATATAAGAATAATAAAG 3221 

Qy 3932 tgaaagattatattttgtatatttacgtaaaaatatt 3968 

II I M III II I Ml II III 
Db 3220 TGGATATATACGTTTATTAAAATTAGTTCCAGTCATT 3184 



RESULT 15 
PCT-US92-00018-1/c 
Sequence 1, Application PCT/US9200018 
GENERAL INFORMATION: 
APPLICANT: Hoffman, Stephen L. 
APPLICANT: Charoenvit, Yupin 
APPLICANT: Hedstrom, Richard 
APPLICANT: Khusmith, Srisin 
APPLICANT: Rogers IV, William O. 
TITLE OF INVENTION: Protective malaria sporozoite surface protein 
TITLE OF INVENTION: immunogen and gene encoding 
NUMBER OF SEQUENCES: 2 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: A. David Spevack 
STREET: NMRDC Building 1 T-12 National Naval 
STREET: Medical Center 
CITY: Bethesda 
STATE: MD 
COUNTRY: USA 
ZIP: 20814-5044 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.24 
CURRENT APPLICATION DATA: 
APPLICATION NUMBER: PCT/US92/00018 
FILING DATE: 19920103 
CLASSIFICATION: 424 
ATTORNEY/AGENT INFORMATION: 
NAME: Spevack, Avram D. 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (301) 295-6759 
TELEFAX: (301) 295-4033 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 4673 base pairs 
TYPE: NUCLEIC ACID 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
HYPOTHETICAL: N 
ANTI-SENSE: N 
ORIGINAL SOURCE: 
ORGANISM: Plasmodium yoelii 
STRAIN: 17X(NL) 
DEVELOPMENTAL STAGE: erythrocytic stage 
TISSUE TYPE: Blood 
CELL TYPE: erythrocytic stage 
IMMEDIATE SOURCE: 
LIBRARY: Pylambdagtll-2-7 kb genomic expression 
CLONE: PylO.llll 
FEATURE: 
NAME/KEY: CDS 
LOCATION: 718..3195 
OTHER INFORMATION: 
PCT-US92-00018-1 



Query Match 1.4%; Score 79; DB 6; Length 4673; 

Best Local Similarity 46.8%; Pred. No. 1.2e-05; 

Matches 354; Conservative 0; Mismatches 395; Indels 8; Gaps 3; 

Qy 3212 atgtaatattatattttaaaataaaattatgttatttagattcttaatattttggagcat 3271 

lllll III II MM I II I I II I II MM I I 
Db 3932 AAGGATGATAATACTTCAAAAAATCATGAGCCAACTTTAGATATTCTCTTTTTTAACATT 3873 

Qy 3272 tccatactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaactt 3331 

I I I I III I I I III I III II I I 
Db 3872 CCATCATTTTTTTTTATCACACTTTTTAGTTCATAAAACTTAAGACCATTATTTTTATGT 3813 

Qy 3332 taaattacaagcataatattaaattttgaatcaattaatttttatttctattattttaat 3391 

ill i nil i i iii i i ii mum ii nil 

Db 3812 TTTGTAAATTTTAAAATAATCACATTTTGTTTATTTCTTTTTTATCGATAATATT---GG 3756 

Qy 3392 taatttagtctattttttcaaaataaaatttaaatctaaataaaaataatttttccttaa 3451 

I II lllllllllll I I MM Ml III I MM III 
Db 3755 TGGATTTGTCTATTTTTTTAGGAATTCATTTTATTATGTATTATCACTTTTTTGTTTTAT 3696 

Qy 3452 tgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgattt 3511 

III I I II I I II I I II I III 
Db 3695 TCATAATTATTTTGAAAATAGTAAATACCGTGTAAATATACAAACCTAAAAATGTTATTA 3636 

Qy 3512 atttattagtatattaattctgattataattatggtgggatacaatcgctttccactaaa 3571 

I II I III I II II I II I III III I III llll 

Db 3635 ACTTTTAAGTTTTTTTTTTTTTTTTTTTTTTTTATATTAAGAATAATTGTAACCATTAAA 3576 

Qy 3572 tattttaactatgatttataaatttatttcaacatcgtatatttacttattaatacataa 3631 

Mil I II III II llll III I I III Ml II I Ml 
Db 3575 TATTGGAGTATAAATAAATATATATATTATAAC--GAGACAATTAGTTAAAAAAAAATAG 3518

Qy 3632 tttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataaca 3691 

II I I II I I III I I II I I II I I I I I II 

Db 3517 TTAAAAAAAAATCGTTAAAAAAAAATATGAAAATAAATGGATATATAATTGAATGAATAA 3458 

Qy 3692 aagacaatttagaaaaaaatgtacttttaggtaattttaagtactcttaaccaaacacaa 3751 

I I II III I III II II I lllll lllll II III II 
Db 3457 CATAAAA---AGATGACAATTTATCAAACTGTTAATTTAAAATAACTTAATCATACAAAA 3401

Qy 3752 aaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgtaat 3811 

Ml II III I III llll II III I II I 
Db 3400 AAAAAGGAACAAAAACAGGAAAAAGGAATAAAGTGTAAAGAAACACAAACAATTTAAAGA 3341 
Qy 3362 tcaattaatttttatttctattattttaattaatttagtctattttttcaaaataaaatt 3421

Db 652 TTTTTTCTTTTTTTTTTTTTTTTTTTTTTTTTT 593 

Qy 3422 taaatctaaataaaaataatttttccttaatgttgaaacaactcatgttatacttcaaaa 3481 

II I II llllllll I I II I I I II I II 
Db 592 AAACTAAGAAAAAAAATAAATAAATGAATAAATTAATAAATAAATATATAAATAAAATTA 533 

Qy 3482 ttataagtattatatttaccttgatgatttatttattagtatattaattctgattataat 3541 

I II I Mill II II MM II llll I II II II I 
Db 532 TAGGAACCACAATATTGGGGAGTATTATATATTGTGTATAATATATAGGATGGTTTTATT 473 

Qy 3542 tatggtgggatacaatcgctttccactaaatattttaactatgatttataaatttatttc 3601 

I I II II llllllll II II I 

Db 472 ATAAGAAGTGTAAAACTATATTAATGTGTACACATCAAAATATTAATAATTGTATTCATA 413 

Qy 3602 aacatcgtatatttacttattaatacataatttatcataattttat--ggaaattgagac 3659

II I I II II I II I I llll MINIMI llll I II 

Db 412 TTAATTGGAAATATATTAATAAGTTTTATATTTCAAGTAATTTTATAAACAAATGAACAC 353 

Qy 3660 caagaaacattaagagaacaaattctataacaaagacaatttagaaaaaaatgtactttt 3719 
Db 352 ACAMCATA^ 293 

Qy 3720 aggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaactaaataaga 3779 

I III I MM I I I II III I Mill III III 

Db 292 ATGTATTGTTAATAAATATAAAGAAAAAAAAAAAAAAAAAAAGTTTTTTATCTATGTTAT 233 

Qy 3780 taatataacatacggaacatcttacttgtaatcttacattcccataattttattatgaaa 3839 

MUM I I llll III I II II III III 

Db 232 TAATATGAATAATCATTATATATACTACATATCAAATATAAATATTTTTTATCTTTTTAT 173 

Qy 3840 aataatcttatattact 3856 

I I II Mill I 
Db 172 TTTTTTATTTTATTATT 156 



RESULT 12 
US-08-883-795A-36 
Sequence 36, Application US/08883795A 
Patent No. 5985607 
GENERAL INFORMATION: 
APPLICANT: Delcuve, Genevieve 
APPLICANT: Awang, Gregor 

TITLE OF INVENTION: Recombinant DNA Molecules and Expression 
TITLE OF INVENTION: Vectors for Tissue Plasminogen Activator 
NUMBER OF SEQUENCES: 39 
CORRESPONDENCE ADDRESS: 



STREET: 40 King Street West

CITY: Toronto

STATE: Ontario 

COUNTRY: Canada 

ZIP: M5H 3Y2 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/883,795A

FILING DATE: 27-JUN-1997 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Gravelle, Micheline 

REGISTRATION NUMBER: 40,261 

REFERENCE/DOCKET NUMBER: 7841-062 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (416) 364-7311 

TELEFAX: (416) 361-1398 
INFORMATION FOR SEQ ID NO: 36: 
SEQUENCE CHARACTERISTICS: 



LENGTH: 665 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: Single 
TOPOLOGY: linear 

MOLECULE TYPE: cDNA

ORIGINAL SOURCE: 
ORGANISM: Homo sapiens 

IMMEDIATE SOURCE: 
CLONE: Rh 32 
US-08-883-795A-36



Query Match 1.5%; Score 82.8; DB 4; Length 665;

Best Local Similarity 49.0%; Pred. No. 2.1e-06;

Matches 285; Conservative 0; Mismatches 287; Indels 10; Gaps 2; 

Qy 4782 acttggcttgatttatgttatgttatgtattttactttaatgatattgcatgtattgtta 4841 

I III II I II I I II III II I III I I llll II 
Db 14 AATTGTGTTTTTATAATTAAATATTTTATAATTAAAATATTTATAATTAAAATATTTATA 73 

Qy 4842 atttaacattgcttgatcattatactcttctactattaattataaatggcactgttttgt 4901 

III II III I II I III I I I II II III I I III 
Db 74 ATTAAATATTTTATAATTAAAATATTTATAATTAAATATTTTATAATTAAAATATTTATA 133 

Qy 4902 ttaaactttttacaagttaagacatgtataaatatatgacaatataattacaagttttag 4961

I II I III I llll I II lllll II II llll III 
Db 134 ATTAAATATTTTATAATTAAAATATTTATAATTAAATATTTTATAATTAAAATATTTATA 193 

Qy 4962 ttcaatgttagctatcttagtatgttattgatgatcttaattacatttaaacaaattcca 5021 

I II I III II II I II I I III I lllll I II I 
Db 194 ATTAAATATTTTATAATTAAAATATTTATAATTAAATATTTTATAATTAAAATATTTATA 253 

Qy 5022 cttaaaattttaataaataa------taacaaataattattgtaatataatacattaaat 5075

lllll III llll III II I III II I III I II I 

Db 254 ATTAAATATTTTATAATTAAAATGTTTATAATTAAATATTTTATAATTAAAATGTTTATA 313 

Qy 5076 gcaacaaaaaatgaaataaataaaataaaatagcaaataattgttataatattgtaatat 5135 

III I I III III III I II I II I I III 
Db 314 ATTACATATTTTATAATTAAAATGTTTATAATTACATATTTTATAATTAAAATGTTTATA 373 

Qy 5136 aatatgtaccatattcttaactgaaatagggtctaacctataatccctaaaatttcagtt 5195 

I II II III llll II I I I I II MM I : 
Db 374 ATTACATATTTTATAATTAAAATGTTTATAATTACATATTTTATAATTAAAATGTTTATA 433 

Qy 5196 taaatatttttatacctgccatattattagaactcttt----ttaaatatattaaaattt 5251

I II III I llllll I 1 1 I III III llll III llll 
Db 434 ATTACATATTTTATAATTACATATTTTATAAAGTATTTATAATTACATATTTTATAATTA 493 

Qy 5252 taattataccaatttaatttaaactattaattatcttaactaaaatctaaaattttattt 5311 

I I I MM II I II I I I II II I II I I Ml 

Db 494 AAGTATTTATAATTACATATTTTATAATTAAAGTATTTATAATTACATATTTTATAATTC 553 

Qy 5312 aacctattaattaaattcctaattatcttatctaatttaaaa 5353 

II I III III II I II llllll 

Db 554 AATATTTTATAAATAGTTAAAAAGACGAGGAAAAAATTAAAA 595 



RESULT 13 
US-08-883-795A-36/C 
Sequence 36, Application US/08883795A 
Patent No. 5985607
GENERAL INFORMATION:
APPLICANT: Delcuve, Genevieve
APPLICANT: Awang, Gregor

TITLE OF INVENTION: Recombinant DNA Molecules and Expression
TITLE OF INVENTION: Vectors for Tissue Plasminogen Activator
NUMBER OF SEQUENCES: 39 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: BERESKIN & PARR 

STREET: 40 King Street West 

CITY: Toronto 

STATE: Ontario 

COUNTRY: Canada 



Tue Sep 5 07:22:56 2000 



us-08-984-1 



099-11. rni 



Page 



Qy 3119 acagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaaaaaaaaa 3178

I I Mill I III II 1 1 1 1 1 1 1 I I I I I lllllllll

Db 2302 ATAAAATAAAATGAAAGTGTGGGAAAAATAAAAATTAAAAATTAAAAATAAAAAAAAAAA 2243

Qy 3179 ctaatgttggttggttgaattttatattacggaat-gtaatattatattttaaaataaaa 3237

II II II I I I I I I I I III I II lllll

Db 2242 AAAAGAAATTTTAAAAAAAAAAAAAAAAAAAAATTAATCAAAAAATAAATATAATTAAAA 2183

Qy 3238 ttatgttatttagattcttaatattttggagcattccatactataatttcgtaacataat 3297

II I I I I I III llll III I III I III II

Db 2182 TTGTCATGCCAAAACTGATAAATATTTGATATATTATCCAATATTTATAAATAAGGTATA 2123

Qy 3298 attaaaatatagtaatataaagtgtaattaactttaaattacaagcataatattaaattt 3357

llll I llll II I llll I I III I I I II I III
Db 2122 ATTAGA--TAGAGAATAAAATTTTAAATTTATTAAAAAAAAAATCAAAAAAAACCAAAGT 2065

Qy 3358 tgaatcaattaatttttatttctattattttaattaatttagtctattttttcaaaataa 3417

III II I I III llll II III I I lllllllll! Ill
Db 2064 AAATAATATTTATAATGAGGGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCAAAGTAA 2005

Qy 3418 aatttaaatctaaataaaaataatttttccttaatgttg---aaacaactcatgttatac 3474

II III III lllll I I II llll llll II III I

Db 2004 AAAAAAAAAAAAAAAAAAAAAAGAAATAGAAAAAAGTTGGTTAAACTACATTAGTTTTTT 1945

Qy 3475 ttcaaaattataagtattatatttaccttgatgatttatttattagtatattaattctga 3534

I II II I I I I II II I II Ml I I llll
Db 1944 ATAGTTTTTGCATATTTAAAAATAACTTTTAATTTTAAATTGATTTTTAATTATGAGATC 1885

Qy 3535 ttataattatggtgggatacaatcgctttccactaaatattttaactatgatttataaat 3594

I llll I I II I I III I II I I III
Db 1884 TAATAAAAAAAAAAAATTTTAAAATTTAAAAAAAAAAGAAAAAAAAAAAAAAAGTAGAAT 1825

Qy 3595 ttatttcaacatcgtatatt------tacttattaatacataatttatcataattttatg 3648

lllll II I lllll I I II III I II llll I II
Db 1824 TTATTAAAAATTAAAATATTATTCAATCTTAATAAATTAAGTATATATCGATAGGCAATT 1765

Qy 3649 gaaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaa 3708

I II I I I I II II I I II I I I III I
Db 1764 TATTTTTATATCTATCTAAAAAAAAACTAGGAAAAATGAATGTCATCAAATAGTATTTTA 1705

Qy 3709 aatgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatga 3768

I I llll I III I! I I I III lllll II I llll
Db 1704 ACATTTTTTTTTTTTTTTTTAAAAAAAAAGTGTCATGACAAAAAAAAAAAAGTGTCATGA 1645

Qy 3769 actaaataagataatataacatacggaacatcttacttgtaatcttacattcccataatt 3828

III II I II I II I I II III II I II I I!

Db 1644 CAAAAAAAAAAAAAAAAAAGAGGGGAAAGTAATTATAACTAGGTTAGTTTTTTATAATTT 1585

Qy 3829 ttattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaa 3888

III llll III I II! I I III I I II

Db 1584 TTACATATTTGTTAATAACTTTTAATTTTGAATCATATGATATTACATCGTCCCGTTGAA 1525

Qy 3889 ataaagaaaaacacttaatttttataacattttttcatatatttgaaagattatattttg 3948

I III lllll I II lllll II llll III III III! I
Db 1524 AAAAAAAAAAAAAATTTTTTTTTTCAAACATTTTCATTTTTTAAAAAATGATATAAAATT 1465

Qy 3949 tatatttacgtaaaaatatttgacatagattgagcaccttcttaacataa 3998

I I II I llll I I I I I II II I III
Db 1464 TTAAACTAAACTATTTTATTAAATACAAACATATAACTTTATCTTAATCA 1415



RESULT 10 
US-07-867-106-2 
Sequence 2, Application US/07867106
Patent No. 5389526 
GENERAL INFORMATION: 
APPLICANT: Slade, Martin B 

APPLICANT: Chang, Andy C M
APPLICANT: Williams, Keith L
TITLE OF INVENTION: Improved Plasmid Vectors for Cellular
TITLE OF INVENTION: Slime Moulds of the Genus Dictyostelium 
NUMBER OF SEQUENCES: 19 



CORRESPONDENCE ADDRESS: 

ADDRESSEE: Woodcock Washburn Kurtz Mackiewicz & Norris
STREET: One Liberty Place 46th Floor 
CITY: Philadelphia 

STATE: PA 
COUNTRY: USA
ZIP: 19103 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/867,106 

FILING DATE: 19920625 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: AU PJ 7187 

APPLICATION NUMBER: PCT/AU90/00530 

FILING DATE: 02-NOV-1989 
ATTORNEY/AGENT INFORMATION: 

NAME: Feeney, Joanne Longo 

REGISTRATION NUMBER: 35,134 

REFERENCE/DOCKET NUMBER: RICE-0002 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 215-568-3100 

TELEFAX: 215-568-3439 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 5852 base pairs 

TYPE: NUCLEIC ACID 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
ANTI-SENSE: NO 
FEATURE: 

NAME/KEY: CDS 

LOCATION: 2378..5038
FEATURE: 

NAME/KEY: CDS 

LOCATION: 2378..5038
US-07-867-106-2 



Query Match 1.5%; Score 85; DB 1; Length 5852; 

Best Local Similarity 46.1%; Pred. No. 1.2e-06;

Matches (value not legible); Conservative 0; Mismatches 460; Indels 12; Gaps 3;

[Alignment rows not recoverable from the scan. Surviving row start positions: Qy 3112 / Db 1381; Qy 3172 / Db 1441; Qy 3232 / Db 1501; Qy 3292 / Db 1561; Qy 3352 / Db 1621; Qy 3409 / Db 1681; Qy 3469. Surviving sequence fragment: CTTGTCATGACACTTTTT.]

COMPUTER: IBM PC compatible

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/487,826B

FILING DATE: 10-SEP-1993 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Israelsen, Ned 

REGISTRATION NUMBER: 29,655 

REFERENCE/DOCKET NUMBER: NIH121.001CP1
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (619) 235-8550 

TELEFAX: (619) 235-0176 
INFORMATION FOR SEQ ID NO: 13: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 19124 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: Single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA
HYPOTHETICAL: NO 
ANTI-SENSE: NO 
US-08-487-826B-13



Query Match 1.8%; Score 100.4; DB 4; Length 19124;

Best Local Similarity 44.1%; Pred. No. 3.1e-09;

Matches 877; Conservative 0; Mismatches 1091; Indels 22; Gaps 10;



Qy 1962 attgaaacgtttaagaatttttactactgcaaattcagaataagtgaatttgttttttag 2021 

Ml II II III I I II II II II I III I II I I 

Db 4660 ATTCATAATTTAGAGATTATGTAATATTGTTTATGTATCGTAATATATATTAATATAATT 4719 

Qy 2022 aaagattaaataagttagtattacgatttttagtttgatttggtggaaagtaatgtatgt 2081 

II Ml II I I I I I II II II II I I I III 

Db 4720 GTTTTTTTAGTATGTATGGTATTCTAATAATATATTCATATGTAGTCATAGTGTCAATGA 4779 

Qy 2082 ttttgaacataattatttgacaataattaagttttctagggaataaacggaaatatcttc 2141 

I I II III I II III I I II II III I I I II 

Db 4780 ATATAAAATATGGTATATTTATATTATTGTATATATTAAATAAGTAACACAGA-ACATTA 4838

Qy 2142 ttcttttttgtaaaattactaatgcaagaacaaacaacgttttggggagcaaataatcta 2201 

I I I llll I II I I III I I II I II 

Db 4839 TATATAGTAATAAATAGAAGAAATAATATATTTTTATGTTATATATTATTAGTTATTATA 4898 

Qy 2202 gctttaagtagtcagtgtaactctcaaaatctggtcataacttctaggctgagtttgctg 2261 

I I III llll lllll I III I III I II 
Db 4899 AAGGGGAAAATTCATAATATTTATGAAAATTTTTGTATATGATATAGTTATAAGTTAAAA 4958 

Qy 2262 tgctacagtagtaagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaa 2321 

I I I III I I I III llll II III I II 

Db 4959 AAAAAAAAAAACAAGAACAAAAATGGAAAGCATAAAAAATGTTACTGTAATAGGATAAAA 5018 

Qy 2322 tctacaacttttcctttttcttcaattaacatatggttgattcaagttccgatctataat 2381 

lllll I III II II II llll III III 

Db 5019 TATATTATATAAAATGTTTATTTTATCTTAAAAAGGTTCCTATTATAACATTAAAAAAAA 5078 

Qy 2382 aatttattacgatttatcaatttcaattaccttatatcatcctattataaatataagtca 2441 

II I III I II I I III II II III III II 

Db 5079 TTTGTCCCATTTTATAAATAATTAACTACATTTACATAATGAAATTTCGATTTTGTGTTT 5138 

Qy 2442 gttcaattcagttttcgaaagttcccaaaaattttgaattttattaaatttattccctaa 2501 

II II I II II II lllll I I I II I 

Db 5139 TTTTGATGAATATTATGGACTAATTATTTATATGTGAATGCGTTCTATATAATAATAATA 5198 

Qy 2502 aaccgaaatagttatatctttcaaatttaagtttcatttttcaatccgattt-caatttc 2560 

I I II I II I I II II I I I II llll I 

Db 5199 ATTTTATTTAAAAAAATGAAAAATAAGAAATAAATATCCTGATTTTGTAGTTCCAATAGC 5258 

Qy 2561 atccttttataactctctattatctataattacataaatttcaaattaattttgaaatat 2620 

I I I II I II III llll I I II I I II I II II 
Db 5259 TTAATATAATTATGGACTCATATATATATTATATATATCTTTACAACAAGTAATAAGTAA 5318 



Qy 2621 ttacactttagtccctaagttcaaaactataaattttcactttagaaattaatcattttt 2680

II III I III III II llll I I I llll I I

Db 5319 ATATTATTTTAATCTTAATAAGGAAAATAAAAATAATAAAATAAGAA---TACTGAATAA 5375

Qy 2681 cacatctaagcatcaaatttaaccaaatgacacaaatttcatgattagttagatcaagct 2740

I II I II I III lllll III I II II lllll I

Db 5376 TAAGTCATATTATACATTTTTTAAAAATGTAACATAATTACAAATACGTAACATGTATTA 5435

Qy 2741 tttgagtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaat 2800

I I I I I II III I I I I III I I I I HUM llll I

Db 5436 TAGAAATAATAAGAATTTAATATTAAGGATAAATATAAATATTTAAAATTATATTTTTTT 5495

Qy 2801 ttgaacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttt 2860

II I I I I II II I I I I II II I llll

Db 5496 ATGTCAATTTATGTTATATTATATTATATTAACATGATTA-GTTTTTTGAAAAATATTTA 5554

Qy 2861 tgttgcaaacggtggagagaagagggaaatgaagattgaccatatttttttattatgttt 2920

I II I I III I I I II I II II I I

Db 5555 AATATCATATAATAATAATAAATTAGTTAAAATAATAGTATTTCATACAAAATACTAACT 5614

Qy 2921 taacatataatattaataatttaatcataattatactttggtgaatgtgacagtggggag 2980

II I II II I III III llll II III llll llll

Db 5615 TATAAGTATATCATATAATATTATATATATATATATTTATGTGTTTTTGATTGGGTGTAT 5674

Qy 2981 atacgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagagt 3040

III I I I I II I II I I I Mill

Db 5675 ATAAGGCTATAAGTATATATGGGTTGTTCATTATATATTTATATGTGAATAGATACATAT 5734

Qy 3041 gatcaaa-------gtttgagctgccttcaatgagccaatttttgcccataatggataaa 3093

I II llll I II II II I III I I III

Db 5735 AAGTTAATATATTTATTTGTGTATATGTCTGTGTTAAGATAGATATGCATTACAGTTAAG 5794

Qy 3094 ggcaatttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtg 3153

II II III II I I III lllll I III I I

Db 5795 GGTTATAGTTTTTTTTTTTTTTTTTTGTACATATATATAAAAAATAGATAACTAACAATA 5854

Qy 3154 gcctggtcacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaat 3213

I I III I III MM I III lllll I I IL

Db 5855 TGCATATTACAAGAATAATATTTGTATAAAATATATATATATATATATATATAAAGACAT 5914

Qy 3214 gtaatattatattttaaaataaaattatgttatttagattcttaatattttggagcattc 3273

III I llll lllll I I I II I! I llll I II

Db 5915 -TAAAACTATACTAATAGGTAATTAGTTTTATTATATCATCCTTTTATTATTATAATTTT 5973

Qy 3274 catactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaacttta 3333

llll 111 I I I I II III llllllll II III II

Db 5974 TTTTGTTTTACTTCTTGTCGTTCTTTTTTGTTATTATAATATAACAAATATAAAACAATA 6033

Qy 3334 aattacaagcataatattaaattttgaatcaattaatttttatttctattattttaatta 3393

I I I II I II I II I II I III I II II

Db 6034 TCAGTATTTGGAATATAAATAAATTTATTCTACATATATGCATATATATATATATATATA 6093

Qy 3394 atttagtctattttttcaaaataaaatttaaatctaaataaaaataattttt---cctta 3450

i i iii i i i i i ii ii 1 1 i i i i mill i i

Db 6094 TATATATATATATATATATATATATATATATGTATGATTTTATACTATTTTTATACATGC 6153

Qy 3451 atgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgatt 3510

II II I I I II HUM III III I II III llll I

Db 6154 ATTTTTATATATTTTAGTATATACTTTAAAGATATTATTAATATTTATATAGTAGCATAT 6213

Qy 3511 tatttattagtatattaattctgattataattatggtgggatacaatcgctttccactaa 3570

I III III I llll llll III I I II I

Db 6214 ATGTATTTATATTATAACAAATATTTTCATTTATATAAATATATAGAACATGAACATTTT 6273

Qy 3571 atattttaactatgatt-tataaatttatttcaacatcgtatatttacttattaatacat 3629

ii 1 1 ii ii iii ii iiiii i iiii iiimiiii i ii

Db 6274 ATTAATAACTCATATTTGAATATATATATTTATAATGTGTATTTTTACTTATTTTTTTAT 6333

Qy 3630 aatttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataa 3689

I I II 'llll II II II I I I I II II llll I I II

Db 6334 ATTATACAATAAAATTTTGAAATTCATAAAATGCATGAAATACATAAAAAAATACAACAA 6393
Qy 4244 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4303

lllllllllll! Illllll 1 1 1 ! 1 1 1 1 IIIIIIIIIIMIII! Illllllllll!
Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192

Qy 4304 tcaaaatacgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaag 4363

imimiimiiiii iiiimi milium mi

Db 193 TCAAAATACGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAA 237

Qy 4364 tatcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaa 4423

miimi iiiii inn iiiimm iiiimi miimiiiiiimi

Db 238 TATCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAA 297

Qy 4424 ccctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtac 4483

lllllllllll llllllllllllllllllllll llllll Illllll Illllll
Db 298 CCCTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTAC 357

Qy 4484 gagaaagaaaatctcgacgggcccgaa 4510

II llllll I III III II
Db 358 GATAAAGAAAAACCCGATTTCCCCAAA 384



RESULT 5 
US-08-787-335-18

Sequence 18, Application US/08787335 
Patent No. 5981834 
GENERAL INFORMATION:

APPLICANT: John, Maliyakal E. 

APPLICANT: Umbeck, Paul P. 

APPLICANT: Brill, Winston J. 

TITLE OF INVENTION: GENETICALY ENGINEERED COTTON PLANTS 
TITLE OF INVENTION: FOR ALTERED FIBER 
NUMBER OF SEQUENCES: 18



ADDRESSEE: Quarles and Brady 

STREET: P.O BOX 2113 

STREET: FIRST WISCONSIN PLAZA 

CITY: MADISON

STATE: WISCONSIN 

COUNTRY: U.S.A. 

ZIP: 53701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette - 3.50 inch, 800Kb storage 

COMPUTER: Apple Macintosh 

OPERATING SYSTEM: Macintosh 

SOFTWARE: Microsoft Word 4.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/787,335 

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/530,797 

FILING DATE:

APPLICATION NUMBER: US 07/253,243

FILING DATE: 04-OCT-88
ATTORNEY/AGENT INFORMATION: 

NAME: Nicholas J. Seay 

REGISTRATION NUMBER: 27,386 

REFERENCE/DOCKET NUMBER: 1122990245 
INFORMATION FOR SEQ ID NO: 18: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1283 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA to mRNA
HYPOTHETICAL: no
ANTI-SENSE: no
ORIGINAL SOURCE:

ORGANISM: Gossypium hirsutum
STRAIN: Coker 312

DEVELOPMENTAL STAGE: 15 day old fiber cells
TISSUE TYPE: fiber cells

IMMEDIATE SOURCE: 



LIBRARY: CKFB15 
CLONE: E9 
US-08-787-335-18



Query Match 5.0%; Score 273.4; DB 4; Length 1283; 

Best Local Similarity 84.2%; Pred. No. 6.1e-39; 

Matches 326; Conservative 0; Mismatches 46; Indels 15; Gaps 1;

Qy 4124 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4183 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72

Qy 4184 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4243 

iiiiiiimiiiimmiii i iimiim i i mm inn n m 

Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 
Qy 4244 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4303 

minium iiiiiii iiiimi iiimiiiiimii iiiimiiin 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 
Qy 4304 tcaaaatacgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaag 4363 

iiiiiiiimimm mum iimimii iiiii 

Db 193 TCAAAATACGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAA 237

Qy 4364 tatcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaa 4423 

iiiiiii iiiii mm imiiini mini iiiiiiiiiiiiiiiih 

Db 238 TATCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAA 297 
Qy 4424 ccctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtac 4483 

iimimii iiimmimiiimiii mm iiiimm iiiiiiij 

Db 298 CCCTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTAC 357 

Qy 4484 gagaaagaaaatctcgacgggcccgaa 4510 

II Illllll I III III II 
Db 358 GATAAAGAAAAACCCGATTTCCCCAAA 384 



RESULT 6 
US-08-487-826B-13/C 
Sequence 13, Application US/08487826B 
Patent No. 5993827 
GENERAL INFORMATION: 
APPLICANT: Sim, Kim L. 
APPLICANT: Chitnis, Chetan
APPLICANT: Miller, Louis H.
APPLICANT: Peterson, David S.
APPLICANT: Su, Xin-zhaun
APPLICANT: Wellems, Thomas E. 

TITLE OF INVENTION: BINDING DOMAINS FROM PLASMODIUM VIVAX 

TITLE OF INVENTION: AND PLASMODIUM FALCIPARUM ERYTHROCYTE BINDING PROTEINS 

NUMBER OF SEQUENCES: 45 



ADDRESSEE: Knobbe Martens Olson & Bear 

STREET: 620 Newport Center Drive 16th Floor 

CITY: Newport Beach 

STATE: California 

COUNTRY: US 

ZIP: 92660 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/487,826B

FILING DATE: 10-SEP-1993 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Israelsen, Ned 

REGISTRATION NUMBER: 29,655 

REFERENCE/DOCKET NUMBER: NIH121.001CP1
TELECOMMUNICATION INFORMATION: 
TISSUE TYPE: fiber cells
IMMEDIATE SOURCE:
LIBRARY: CKFB15
CLONE: E9 
US-07-885-970A-17 



Query Match 5.0%; Score 273.4; DB 1; Length 1283;

Best Local Similarity 84.2%; Pred. No. 6.1e-39; 

Matches 326; Conservative 0; Mismatches 46; Indels 15; Gaps 1;

Qy 4124 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4183 

i ii i nun Minimi i iiiiiiMiiiM i ii mi nun 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72 

Qy 4184 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4243 

IMIIIIIIIIIIIIIIIIIII I MINIM! I I IIIIM Mill || IN 
Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4244 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4303 

iiiiiiiinii mini minim iiimmiiiiiii immniii 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 
Qy 4304 tcaaaatacgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaag 4363 

1 1 1 1 1 1 m ii 1 1 1 1 1 1 1 ! mum iiiiimiii inn 

Db 193 TCAAAATACGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAA 237

Qy 4364 tatcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaa 4423 

iiiiimi inn mm minim iiiiiii imiiiiimiimi 

Db 238 TATCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAA 297 
Qy 4424 ccctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtac 4483 

i r 1 1 r 1 1 r 1 1 1 imiimimimimi mm iiiiiii minim 

Db 298 CCCTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTAC 357 

Qy 4484 gagaaagaaaatctcgacgggcccgaa 4510 

II 1 1 1 1 1 1 M | Ml III II 
Db 358 GATAAAGAAAAACCCGATTTCCCCAAA 384 



RESULT 2 
US-08-298-687A-17 
Sequence 17, Application US/08298687A
Patent No. 5521078 
GENERAL INFORMATION: 
APPLICANT: John, Maliyakal E. 
TITLE OF INVENTION: GENETICALLY ENGINEERING COTTON 
TITLE OF INVENTION: PLANTS FOR ALTERED FIBER 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Nicholas J. Seay, Quarles & Brady

STREET: P.O. Box 2113, First Wisconsin Plaza 

CITY: Madison 

STATE: Wisconsin 

COUNTRY: USA 

ZIP: 53701 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: Microsoft Word 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/298,687A

FILING DATE: 

CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/617,239 

FILING DATE: 21-NOV-1990 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/253,243 

FILING DATE: 04-OCT-1988
ATTORNEY/AGENT INFORMATION: 

NAME: Seay, Nicholas J.




REGISTRATION NUMBER: 27,386 
TELECOMMUNICATION INFORMATION:
TELEPHONE: (608) 283-2478 
TELEFAX: (608) 251-5139 
INFORMATION FOR SEQ ID NO: 17: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1283 base pairs 
TYPE: nucleic acid 

STRANDEDNESS: double 

TOPOLOGY: linear 

MOLECULE TYPE: cDNA
HYPOTHETICAL: NO
ANTI-SENSE: NO
ORIGINAL SOURCE: 

ORGANISM: Gossypium hirsutum 

STRAIN: Coker 312 

DEVELOPMENTAL STAGE: 15 day old fiber cells 
TISSUE TYPE: fiber cells 

IMMEDIATE SOURCE: 
LIBRARY: CKFB15 
CLONE: E9 
US-08-298-687A-17 



Query Match 5.0%; Score 273.4; DB 1; Length 1283;

Best Local Similarity 84.2%; Pred. No. 6.1e-39;

Matches 326; Conservative 0; Mismatches 46; Indels 15; Gaps 1;



Qy 4124 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4183 
Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72
Qy 4184 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4243 

1 1 1 [i i ii 1 1 1 1 1 1 1 mi i 1 1 1 1 1 1 1 1 1 1 i i mm inn n in 

Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 
Qy 4244 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4303 

iii ii iiiiiii mm! mum miiimiinm mimmii 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 
Qy 4304 tcaaaatacgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaag 4363 

iiiiiimiiiiiiiii immi minium inn 

Db 193 TCAAAATACGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAA 237

Qy 4364 tatcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaa 4423 

mimmii urn mm imiiimi iiiiiii iiiiimmmim 

Db 238 TATCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAA 297 
Qy 4424 ccctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtac 4483 

minimi iiNiiiiiiiiiiNiiiiM mm iiiinii i mini 

Db 298 CCCTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTAC 357 

Qy 4484 gagaaagaaaatctcgacgggcccgaa 4510 

II MMMI I III III II 
Db 358 GATAAAGAAAAACCCGATTTCCCCAAA 384 



RESULT 3 
US-08-530-797-18 
Sequence 18, Application US/08530797 
Patent No. 5597718 
GENERAL INFORMATION: 
APPLICANT: John, Maliyakal E. 
APPLICANT: Umbeck, Paul F. 
APPLICANT: Brill, Winston J.

TITLE OF INVENTION: GENETICALY ENGINEERED COTTON PLANTS 
TITLE OF INVENTION: FOR ALTERED FIBER 
NUMBER OF SEQUENCES: 18 



ADDRESSEE: Quarles and Brady 
STREET: P.O BOX 2113 
STREET: FIRST WISCONSIN PLAZA 
CITY: MADISON 
II III I III III II MM I I I Mil I I

Db 5319 ATATTATTTTAATCTTAATAAGGAAAATAAAAATAATAAAATAAGAA---TACTGAATAA 5375

Qy 2681 cacatctaagcatcaaatttaaccaaatgacacaaatttcatgattagttagatcaagct 2740 

I I I I I I I I I I I I I I I I I I I I I I I I I. I I I I 

Db 5376 TAAGTCATATTATACATTTTTTAAAAATGTAACATAATTACAAATACGTAACATGTATTA 5435

Qy 2741 tttgagtcttcaaaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaat 2800 

I I I I I II III I M I III I I I I lllllll II I I I 

Db 5436 TAGAAATAATAAGAATTTAATATTAAGGATAAATATAAATATTTAAAATTATATTTTTTT 5495 

Qy 2801 ttgaacaacaaagcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttt 2860 

II I I I I II III I I I I I II I I III 

Db 5496 ATGTCAATTTATGTTATATTATATTATATTAACATGATTA-GTTTTTTGAAAAATATTTA 5554 

Qy 2861 tgttgcaaacggtggagagaagagggaaatgaagattgaccatatttttttattatgttt 2920 

I II I I I II I I Ml I II II I I 

Db 5555 AATATCATATAATAATAATAAATTAGTTAAAATAATAGTATTTCATACAAAATACTAACT 5614

Qy 2921 taacatataatattaataatttaatcataattatactttggtgaatgtgacagtggggag 2980 

II I II II I III III llll II III I III I I I I 

Db 5615 TATAAGTATATCATATAATATTATATATATATATATTTATGTGTTTTTGATTGGGTGTAT 5674 

Qy 2981 atacgtaaagtattttaacattatactttttgcaagcagttggctggtctacccaagagt 3040 

III I I I I II I II I I I I II I I 

Db 5675 ATAAGGCTATAAGTATATATGGGTTGTTCATTATATATTTATATGTGAATAGATACATAT 5734 

Qy 3041 gatcaaa gtttgagctgccttcaatgagccaatttttgcccataatggataaa 3093 

I II llll I II II II I III I I III 

Db 5735 AAGTTAATATATTTATTTGTGTATATGTCTGTGTTAAGATAGATATGCATTACAGTTAAG 5794 

Qy 3094 ggcaatttgtttagttcaactgctcacagaataatgttaaaatgaaattaaaataaggtg 3153 

Db 5795 GGTTATAGTTTTTTTTTTTTTTTTTTGTACATATATATAAAAAATAGATAACTAACAATA 5854

Qy 3154 gcctggtcacacacacaaaaaaaaactaatgttggttggttgaattttatattacggaat 3213 

I I III I II I III I I I I I Mill I I II 

Db 5855 TGCATATTACAAGAATAATATTTGTATAAAATATATATATATATATATATATAAAGACAT 5914 

Qy 3214 gtaatattatattttaaaataaaattatgttatttagattcttaatattttggagcattc 3273 

III I llll I I III I I I II II I llll I II 

Db 5915 -TAAAACTATACTAATAGGTAATTAGTTTTATTATATCATCCTTTTATTATTATAATTTT 5973 

Qy 3274 catactataatttcgtaacataatattaaaatatagtaatataaagtgtaattaacttta 3333 

I I I I III I II Ml III llllllll II III II 

Db 5974 TTTTGTTTTACTTCTTGTCGTTCTTTTTTGTTATTATAATATAACAAATATAAAACAATA 6033 

Qy 3334 aattacaagcataatattaaattttgaatcaattaatttttatttctattattttaatta 3393 

I I I II MM III II Mil I II II 

Db 6034 TCAGTATTTGGAATATAAATAAATTTATTCTACATATATGCATATATATATATATATATA 6093 

Qy 3394 atttagtctattttttcaaaataaaatttaaatctaaataaaaataattttt---cctta 3450 

I llllllll Mill I I II II Mill I I 

Db 6094 TATATATATATATATATATATATATATATATGTATGATTTTATACTATTTTTATACATGC 6153

Qy 3451 atgttgaaacaactcatgttatacttcaaaattataagtattatatttaccttgatgatt 3510 

llll I I I II lllllll III III I II III Ml I I 

Db 6154 ATTTTTATATATTTTAGTATATACTTTAAAGATATTATTAATATTTATATAGTAGCATAT 6213 

Qy 3511 tatttattagtatattaattctgattataattatggtgggatacaatcgctttccactaa 3570 

I III III I llll llll III I llll 

Db 6214 ATGTATTTATATTATAACAAATATTTTCATTTATATAAATATATAGAACATGAACATTTT 6273 

Qy 3571 atattttaactatgatt-tataaatttatttcaacatcgtatatttacttattaatacat 3629 

II I I II II III II Mil I llll llllllllll I II 

Db 6274 ATTAATAACTCATATTTGAATATATATATTTATAATGTGTATTTTTACTTATTTTTTTAT 6333

Qy 3630 aatttatcataattttatggaaattgagaccaagaaacattaagagaacaaattctataa 3689

I I II llll lllllll I I Ml II llll I I II

Db 6334 ATTATACAATAAAATTTTGAAATTCATAAAATGCATGAAATACATAAAAAAATACAACAA 6393

Qy 3690 caaagacaatttagaaaaaaat-gtacttttaggtaattttaagtactcttaaccaaaca 3748 

II I, II III I II I II llll II III II . 



Db 6394 AACAAATGATAAAAACATTTTTATTAATATAATATAATATAATATAATAATATATTTTTC 6453

Qy 3749 caaaaattcaaatcaaatgaactaaataagataatataacatacggaacatcttacttgt 3808 

I III I I II III MM II II II 

Db 6454 CTGTTATTTATTTATCATTTTTTTTTTGATGCTATATATATTATTATATAATAAATTATA 6513

Qy 3809 aatcttacattcccataattttattatgaaaaataatcttatattactcgaactaaatgt 3868 

I I! I II III II II MM II I III I II II 
Db 6514 ATATATA---ACAACAAAAATTAATAATAATAATATACTACTTTTAATATAATACAACAA 6570

Qy 3869 tgtcacaaattattatctaaataaagaaaaacacttaatttttataacattttttcatat 3928 

I I III IMII llll II I I MM I I III 
Db 6571 TACAAAGAATATGTATCTATATCAATTATATATATATGAATATATAAATATGATAGATAA 6630 

Qy 3929 atttgaaaga 3938 

I II III 
Db 6631 TATAGATAGA 6640 
Qy 2782 cttaaaatcatttatcaatttgaacaacaaagcttggccgaatgctaagagcttaaaaat 2841

I I II II lllll II I I I I III I II I I I I 

Db 6637 ATCTATATTATCTATCATATTTATATATTCATATATATATAATTGATATAGATACATATT 6578 

Qy 2842 ggcttcttttgtttctttttgttgcaaacggtggagagaagagggaaatgaagattgacc 2901 

I II I I II I I I I II II I I I 
Db 6577 CHTGTATTGTTGTATTATATTAAAAGTAGTATATTATTATTATTAATTTTTGTTGTTAT 6518 

Qy 2902 atatt----tttttattatgttttaacatataatattaataatttaatcataattatact 2957 

lllll III I III I I lllll II II II III Nil II I 
Db 6517 ATATTATAATTTATTATATAATAATATATATAGCATCAAAAAAAAAATGATAAATAAATA 6458 

Qy 2958 ttggtgaatgtgacagtggggagatacgtaaagtattttaacattatactttttgcaagc 3017 

I II I I III III INI I I lllll I 

Db 6457 ACAGGAAAAATATATTATTATATTATATTATATTATATTAATAAAAATGTTTTTATCATT 6398 

Qy 3018 agttggctggtctacccaagagtgatcaaagtttgagctgccttcaatgagccaattttt 3077 

III I II I- II I I I I II II I I III I 
Db 6397 TGTTTTGTTGTATTTTTTTATGTATTTCATGCATTTTATGAATTTCAAAATTTTATTGTA 6338 

Qy 3078 gcccataatggataaaggcaatttgtttagttcaactgctcacagaataatgttaaaatg 3137 

Mil I III II I lllll II III I I 
Db 6337 TAATATAAAAAAATAAGTAAAAATACACATTATAAATATATATATTCAAATATGAGTTAT 6278 

Qy 3138 aaattaaaataaggtggcctggtcacacacacaaaaaaaaactaatgttggttggttgaa 3197 

III III II II III III Mil Mil I I 
Db 6277 TAATAAAATGTTCATGTTCTATATATTTATATAAATGAAAATATTTGTTATAATATAAAT 6218 

Qy 3198 ttttatattacggaatgtaatattatattttaaaataaaattatgttatttagattctta 3257 

lllll II III I II II I III III II I I II II 
Db 6217 ACATATATGCTACTATATAAATATTAATAATATCTTTAAAGTAT-ATACTAAAATATATA 6159 

Qy 3258 atattttggagcattccatactataatttcgtaacataatattaaaatatagtaatataa 3317 

III III lllll II II llll I Mill lllll 

Db 6158 AAAATGCATGTATAAAAATAGTATAAAATCATACATATATATATATATATATATATATAT 6099 

Qy 3318 agtgtaattaactttaaattacaagcataatattaaattttgaatcaattaatttttatt 3377 

I II I I I II I lllll llll I I III I I 
Db 6098 ATATATATATATATATATATATATGCATATATGTAGAATAAATTTATTTATATTCCAAAT 6039 

Qy 3378 tctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaataaaaa 3437 

II llll I I llll I llll I II I llll I II llll 

Db 6038 ACTGATATTGITTTATATTTGTTATATTATAATAACAAAAAAGAACGACAAGAAGTAAAA 5979 

Qy 3438 taatttttccttaatgttgaaacaactcatgttatacttcaaaattataagtattatatt 3497 

II II III llll Ml I II I I II llll I 
Db 5978 CAAAAAAAATTATAATAATAAAAGGATGATATAATAAAACTAATTACCTATTAGTATAGT 5919 

Qy 3498 taccttgatgatttatttattagtatattaattctgattataatta----tggtgggata 3553 

I II I III III lllll II I II II II I II III 
Db 5918 TTTAATGTCTTTATATATATATATATATATATATTTTATACAAATATTATTCTTGTAATA 5859 

Qy 3554 caatcgctttccactaaatattttaactatgatttataaatttatttcaacatcgtatat 3613 

I I II MINI III I II I III III 
Db 5858 TGCATATTGTTAGTTATCTATTTTTTATATATATGTACAAAAAAAAAAAAAAAAAACTAT 5799 

Qy 3614 ttacttattaatacataatttatcataattttatggaaattgagaccaagaaacattaag 3673 

.11 III II I I II II I II II III II 
Db 5798 AACCCTTAACTGTAATGCATATCTATCTTAACACAGACATATACACAAATAAATATATTA 5739 

Qy 3674 agaacaaattctataacaaagacaatttagaaaaaaatgtacttttaggtaattttaagt 3733 

I I I I II I Ml Ml I II llll llll 
Db 5738 ACTTATATGTATCTATTCACATATAAATATATAATGAACAACCCATATATACTTATAGCC 5679 

Qy 3734 actcttaaccaaacacaaaaattcaaatcaaatgaactaaataagataatataacatacg 3793 

III I Mill! Ill I III llll II llll I llll 
Db 5678 TTATATACACCCAATCAAAAACACATAAATATATATATATATAATATTATATGATATAC- 5618 

Qy 3794 gaacatcttacttgtaatcttacattcccataattttattatgaaaaataatcttatatt 3853 

II MM I III Ml Mill I II I MM 

Db 5619 TTATAAGTTAGTATTTTGTATGAAATACTATTATTTTAACTAATTTATTATTATTATATG 5560 



Qy 3854 actcgaactaaatgttgtcacaaattattatctaaataaagaaaaacacttaatttttat 3913 
I I I I II III III II I llll I MM I I I I 

Db 5559 ATATTTAAATATTTTTCAAAAAACTAATCATGTTAATATAATATAATA-TAACATAAATT 5501 

Qy 3914 aacattttttcatatatttgaaagattatattttgtatatttacgtaaaaatatt 3968 

llll lllll II II I III II llll III I II 
Db 5500 GACAIAAAAAAATATAATTTTAAATATTTATATTTATCCTTAATATTAAATTCTT 5446 



RESULT 14 
V22740 

ID V22740 standard; DNA; 3701 BP. 

AC V22740; 

DT 28-SEP-1998 (first entry) 

DE Babesia microti BMNI-10 antigen sequence. 

KW antigen; detection; diagnosis; vaccine; tick-borne disease; 

KW differentiation; Lyme disease; ehrlichiosis; ss. 

OS Babesia microti. 

FH Key Location/Qualifiers 

FT CDS 1210..2599 

FT /*tag= a 

FT /product= antigen 

PN EP-834567-A2. 

PD 08-APR-1998. 

PF 01-OCT-1997; 117067. 

PR 24-APR-1997; US-845258. 

PR 01-OCT-1996; US-723142. 

PA (CORI-) CORIXA CORP. 

PI Houghton R, Lodes MJ, Reed SG, Sleath PR; 

DR WPI; 98-195465/18. 

DR P-PSDB; W56290. 

PT Polypeptides comprising Babesia microti antigens and their 

PT immunogenic fragments or epitopes - and related nucleic acid, 

PT vectors, transformed cells and antibodies, useful for diagnosis of 

PT infection and in protective vaccines 

PS Claim 8; Page 32-35; 113pp; English. 

CC The sequence is that encoding a polypeptide comprising at least 

CC one antigenic portion of a Babesia microti antigen. It can be used 

CC to diagnose B. microti infection by detecting specific antibodies 

CC in usual immunoassays. Infection can also be diagnosed using: 

CC (a) primers or probes derived from the coding sequence, in 

CC standard amplification or hybridisation tests, or (b) using 

CC antibodies to detect the corresponding antigen. It is also 

CC useful in vaccines to protect against infection, especially 

CC when formulated with an adjuvant. The new diagnostic methods 

CC allow rapid differentiation between B. microti infection and 

CC other tick-borne diseases (Lyme disease and ehrlichiosis) that 

CC have similar symptoms but require different treatments. 

SQ Sequence 3701 BP; 1458 A; 457 C; 492 G; 1294 T; 



Query Match 1.8%; Score 101; DB 1; Length 3701; 

Best Local Similarity 47.4%; Pred. No. 6.1e-06; 

Matches 398; Conservative 0; Mismatches 435; Indels 6; Gaps 3; 
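
The statistics lines in this listing can be cross-checked against each other. The sketch below assumes that "Best Local Similarity" is the number of matches divided by the total number of aligned columns (matches + conservative + mismatches + indels). That definition is inferred from the figures printed in this listing, not taken from the search program's documentation, and the function name is hypothetical.

```python
def local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns

# Figures from the V22740 hit above: Matches 398; Conservative 0;
# Mismatches 435; Indels 6.
print(round(local_similarity(398, 0, 435, 6), 1))
```

A check of this kind is mainly useful for spotting digits that were damaged during scanning: if the recomputed percentage disagrees with the printed one, one of the counts or the percentage itself is suspect.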

Qy 3200 ttatattacggaatgtaatattatattttaaaataaaattatgttatttagattcttaat 3259 

lllllll I II Mill I III I II III III II I I 
Db 369 TTATATTCATGTGGTTATAATTATAAAAGTATATATAGTTTTGTAATTGTAATGATATAA 428 

Qy 3260 attttggagcattccatactataatt----tcgtaacataatattaaaatatagtaatat 3315 

MM I II mill I II llll III lllll I III 
Db 429 AATTAGAACAGATATAATTAATAATTCAAATATTATATTAATTTTATTATATATGATTAT 488 

Qy 3316 aaagtgtaatta-actttaaattacaagcataatattaaattttgaatcaattaattttt 3374 

I II Ml II II M III I I III I II II 
Db 489 TATTGATATTTATATAATTACATATTGTTATTGTATCATTTAATGATTATATATCAATAT 548 

Qy 3375 atttctattattttaattaatttagtctattttttcaaaataaaatttaaatctaaataa 3434 

I III I Ml II I Mill III III I II I 
Db 549 CCATATATATATATAATAATTGAATTATAATTAAATTAATTGGCATATTACATTTATAAT 608 

Qy 3435 aaataatttttccttaatgttgaaacaactcatgttatacttcaaaattataagtattat 3494 

II Ml II I III I llll II Ml MM MM I II I 
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RESULT 10 
T70055 

ID T70055 standard; cDNA; 1283 BP. 

AC T70055; 

DT 20-AUG-1997 (first entry) 

DE Cotton fibre specific cDNA clone E9. 

KW cotton; E6; fibre; promoter; transgenic plant; truncated; 

KW heterologous gene expression; ds. 

OS Gossypium hirsutum strain Coker 312. 

PN US5620882-A. 

PD 15-APR-1997. 

PF 04-OCT-1988; 253243. 

PR 04-OCT-1988; US-253243. 

PR 21-NOV-1990; US-617239. 

PR 18-MAY-1992; US-885970. 

PR 19-OCT-1994; US-298829. 

PA (CETU ) AGRACETUS INC. 

PI John M; 

DR WPI; 97-235185/21. 

PT DNA constructs contg. truncated promoter sequence - for 

PT fibre-specific gene expression in cotton plants 

PS Example 3; Column 45-48; 48pp; English. 

CC T70040-57 are cotton fibre-specific cDNA clones which can be used to 

CC obtain genomic clones containing fibre-specific promoters. Claimed DNA 

CC constructs comprise a truncated promoter sequence (from one of T70031-38) 

CC that promotes preferential gene expression in plant fibre cells, a 

CC protein coding sequence not naturally associated with the promoter 

CC sequence and a 3' termination sequence. The DNA constructs are useful for 

CC expressing foreign genes in fibre-producing plants, esp. to produce 

CC transgenic cotton plants with varied cotton fibre characteristics and 

CC quality. The present sequence comprises E9 cDNA isolated from clone 

CC CKFB15-E9 (CK = Coker; FB15 = 15 day old bolls). 

SQ Sequence 1283 BP; 509 A; 233 C; 251 G; 290 T; 



Query Match 5.0%; Score 273.4; DB 1; Length 1283; 

Best Local Similarity 84.2%; Pred. No. 2.4e-28; 

Matches 326; Conservative 0; Mismatches 46; Indels 15; Gaps 1; 

Qy 4124 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4183 

Db 13 ACTAAAAATTCTTTGOT 72 

Qy 4184 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4243 

11111111111111111111111 I IIIIMMII I I MUM Ml M Ml 
Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 

Qy 4244 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4303 

MMMIMM! MIMM MMMM IMIIIMMIIMM 1 1 1 1 i 1 1 1 1 1 1 1 
Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 

Qy 4304 tcaaaatacgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaag 4363 

1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 MMMM lllllllllll Mill 

Db 193 TCAAAATACGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAA 237 

Qy 4364 tatcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaa 4423 

MIMM III! Illlll MIMMMI MIMM Ml II II I II II MM 
Db 238 TATCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAA 297 

Qy 4424 ccctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtac 4483 

milium mmmmmmmmimi mm mum imiimi 

Db 298 CCCTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTAC 357 



Qy 4484 gagaaagaaaatctcgacgggcccgaa 4510 

II 1 1 1 ! 1 1 1 1 I III III II 
Db 358 GATAAAGAAAAACCCGATTTCCCCAAA 384 



RESULT 11 
T43361 

ID T43361 standard; cDNA; 974 BP. 
AC T43361; 



DT 11-MAR-1997 (first entry) 

DE Cotton FbLate 2-82A gene cDNA clone A8 (FbLate-1). 

KW FbLate; promoter; fibre; transgenic plant; cotton; ds. 

OS Gossypium hirsutum. 

PN WO9639021-A1. 

PD 12-DEC-1996. 

PF 06-JUN-1996; U09449. 

PR 06-JUN-1995; US-467504. 

PA (MONS ) MONSANTO CO. 

PI John ME; 

DR WPI; 97-042726/04. 

PT Plant fibre-specific, developmentally regulated FbLate promoter - 

PT useful for producing transgenic plants, esp. cotton, with altered 

PT fibre properties 

PS Claim 8; Page 55-56; 79pp; English. 

CC cDNA clones A8 or FbLate-1 (T43361) and A11 or FbLate-2 (T43362) 

CC correspond to RNAs prevalent in late development of cotton 

CC fibers. They were isolated from a 23-day cotton fibre cDNA 

CC library by screening with 24-day fibre cDNA. A8 and A11 are 

CC partial clones of the FbLate 2-82A gene. They can be used to 

CC identify FbLate promoters (see also T43360) useful for fibre- 

CC specific expression of foreign proteins in transgenic plants, esp. 

CC cotton fiber. 

SQ Sequence 974 BP; 388 A; 161 C; 222 G; 203 T; 
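
Because the SQ line gives both a total length and per-base counts, it offers a quick consistency check on scanned records. The following is a minimal sketch, assuming the line has the form shown above (a length followed by `BP`, then counts for A, C, G and T each ending in a semicolon); the function name is hypothetical and any "other" count is ignored.

```python
import re

def check_sq_line(line):
    """Return (declared_length, sum_of_base_counts) for an 'SQ Sequence ...' line."""
    # Declared sequence length, e.g. "974 BP"
    length = int(re.search(r"(\d+)\s*BP", line).group(1))
    # Per-base counts, e.g. "388 A;", "161 C;", "222 G;", "203 T;"
    counts = [int(n) for n in re.findall(r"(\d+)\s+[ACGT];", line)]
    return length, sum(counts)

declared, summed = check_sq_line("SQ Sequence 974 BP; 388 A; 161 C; 222 G; 203 T;")
print(declared, summed, declared == summed)
```

When the two numbers differ, either a count was misread during scanning or the record contains bases outside A, C, G and T.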



Query Match 3.8%; Score 210.5; DB 1; Length 974; 

Best Local Similarity 68.1%; Pred. No. 3.6e-20; 

Matches 340; Conservative 1; Mismatches 145; Indels 13; Gaps 3; 

Qy 4311 acgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacg 4370 

I Mill I III I Ml MIMM M I II I I I II 

Db 446 AAGAAAAACCCGATTTCCCCAAATGGGAAAAGCCTAAAGGGCACGAGAAACATAAAGCCG 505 

Qy 4371 aagagtactcaaaacttgagaagcctgaaatgcaaaagg------aggaaaaacaaaaac 4424 

II I II II 1 1 II Mil I I I I Mill I II I 

Db 506 AATATCCGAAAATACCTGAGTGCAAGGAAAAACTAGATGAGGATAAGGAACATAAACATG 565 

Qy 4425 cctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacg 4484 

I I I Illlll I I I III I III III 
Db 566 AGTTCCCAAAGCATGAAAAAGAAGAGGAGAAGAAACCTGAGAAAGGCATAGTACCCTGAG 625 

Qy 4485 agaaagaaaatctcgacgggcccgaagatcttcgctagccgtcgacgcccgggggaattc 4544 

I MM I I Illlll I I II I II II I I 
Db 626 TGGGTTAAAATGCCTGAATGGCCGAAGTCCATGTTTACTCAGTCTGGCTCGAG------C 679 

Qy 4545 gtcgagccttgaatcatatgacgctggtgcatgtgccatcatcatgcagtaatttcatgg 4604 

iimii i mum iiiiiiiiiiiiiiiiiiiiiiiiiiiiimiiiiii 

Db 680 ACTAAGCCTTAAGCCATATGACACTGGTGCATGTGCCATCATCATGCAGTAATTTCATGG 739 

Qy 4605 tatatcgtaa-tatatagttaataaaaaagatggtgattgggaaatgtgtgtgtgcattc 4663 

MM Mil Mill IIIIIIIIIMIIIIIIIII 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 
Db 740 GATATTGTAATTATATTGTTAATAAAAAAGATGGTGAGTGGGAAATGTGTGTGTGCATTC 799 

Qy 4664 ctccatgcactaatggtgaatctctttgcatacatagaaattctaaatggttatagttta 4723 

iiiiii i mi iiiiiiiiiimii mill urn imimmim 

Db 800 ATCCATGTAGCAATGCTGAATCTCTTTGCATGCATAGAGATTCTGAATGGTTATAGTTTA 859 
Qy 4724 tgttatagtgtatgttgtagtgaaaktaattttaaatgttgtatctaatgttaacatcac 4783 

iiMiii ii mi iiiiiiimiiiiii 1 1 ; 1 1 1 1 1 1 1 minimum 

Db 860 TGTTATATCGTTTGTTCTAGTGAAATTAATTTTGAATGTTGTATGTAATGTTAACATCAC 919 



Qy 4784 ttggcttgatttatgttat 4802 

mmmiiiimi I 

Db 920 TTGGCTTGATTTATGTTTT 938 



RESULT 12 
T43362 

ID T43362 standard; cDNA; 645 BP. 
AC T43362; 

DT 11-MAR-1997 (first entry) 

DE Cotton FbLate 2-82A gene cDNA clone A11 (FbLate-2). 
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Db 61 TTACTCATTACTGTCTCACTAATGATCGGTAGCCACACCGTCTCGTCAGCGGCTCGACAT 120 
Qy 4259 ttattcgagacacaagcaacctcatcagagctcccacaattggcttcaaaatacgaaaag 4318 

nun i nun iimiiiimmi iiiiiiiimiiiiiiiiiimiii 

Db 121 TTATTCCACACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATACGAAAAG 180 
Qy 4319 cacgaagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacgaagagtac 4378 

illinium inmiiiimmi urn iiimiii 

Db 181 CACGAAGAGTCT---------------GAATACAAACAGCCAAAATATCATGAAGAGTAC 225 

Qy 4379 tcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgcaaacagcat 4438 

linn iiiiiiiiiimiii i imiiiiiiiiiiiiiiiimmm in 

Db 226 CCAAAACATGAGAAGCCTGAAATGTACAAGGAGGAAAAACAAAAACCCTGCAAACATCAT 285 
Qy 4439 gaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaagaaaatctc 4498 

iiiimmimiiiii nun mini mmim mum i i 

Db 286 GAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGAAAAACCC 345 

Qy 4499 gacgggcccgaa 4510 

II III II 
Db 346 GATTTCCCCAAA 357 



RESULT 6 
T43366 

ID T43366 standard; DNA; 519 BP. 

AC T43366; 

DT 11-MAR-1997 (first entry) 

DE Cotton FbLate 2-82A gene cDNA clone A11 amplified fragment. 

KW FbLate; promoter; fibre; transgenic plant; cotton; 

KW Gossypium hirsutum; ds. 

OS Synthetic. 

PN WO9639021-A1. 

PD 12-DEC-1996. 

PF 06-JUN-1996; U09449. 

PR 06-JUN-1995; US-467504. 

PA (MONS ) MONSANTO CO. 

PI John ME; 

DR WPI; 97-042726/04. 

PT Plant fibre-specific, developmentally regulated FbLate promoter - 

PT useful for producing transgenic plants, esp. cotton, with altered 

PT fibre properties 

PS Example 5; Page 63; 79pp; English. 

CC A DNA clone (T43366) was generated by 5' RACE using primers (see 

CC also T43364-65) based on FbLate2 clone A11 (T43362), a partial 

CC cDNA clone corresponding to mRNA prevalent in the late development 

CC of cotton fibre. The RACE product showed 91.6% similarity at the 

CC nucleotide level to the genomic clone, FbLate2-82A (see also 

CC T43360). The homology of the RACE product started from nucleotide 

CC position 2269 of the FbLate2-82A gene. The ATG initiation codon 

CC was identified at position 2315 of the gene. 

SQ Sequence 519 BP; 191 A; 127 C; 87 G; 114 T; 



Query Match 5.0%; Score 275; DB 1; Length 519; 

Best Local Similarity 85.5%; Pred. No. 1.6e-28; 

Matches 324; Conservative 0; Mismatches 40; Indels 15; Gaps 1; 

Qy 4132 ttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttcttcctttt 4191 

millllllllllllM 1 1 M II M 1 1 1 1 1 1 M I II MM mimimiii 

Db 80 TTCTTTTCTTTCTATTTGGTTAACCATGGCTCATAACTTTTGTCATCCTTTCTTCCTTTT 139 
Qy 4192 ccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcggcagcggc 4251 

> iiiiimmmi i 1 1 1 1 1 1 1 1 1 1 i i nun urn n in mini 

Db 140 CCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCGTCAGCGGC 199 
Qy 4252 tcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggcttcaaaata 4311 

inn mini mum imniiimmi miimmimimi 

Db 200 TCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCTTCAAAATA 259 
Qy 4312 cgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaagtatcacga 4371 

miiiim mini minimi inn mini 

Db 260 CGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAATATCACGA 304 



Qy 4372 agagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaaccctgcaa 4431 

i i iii nun 1 1 1 1 1 1 1 1 ! i mini innimmmmiinim 

Db 305 AAACTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAACCCTGCAA 364 
Qy 4432 acagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtacgagaaaga 4491 

ill ininnnnnninn nun mmii mmim inn 

Db 365 ACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTACGATAAAGA 424 

Qy 4492 aaatctcgacgggcccgaa 4510 

III I III III II 
Db 425 AAAACCCGATTTCCCCAAA 443 



RESULT 7 
T13048 

ID T13048 standard; cDNA; 1283 BP. 

AC T13048; 

DT 27-MAY-1996 (first entry) 

DE Cotton fibre-specific cDNA clone E9. 

KW Cotton; fibre; promoter; transgenic plant; crop improvement; ds. 

OS Gossypium hirsutum strain Coker 312. 

PN US5495070-A. 

PD 27-FEB-1996. 

PF 04-OCT-1988; 253243. 

PR 04-OCT-1988; US-253243. 

PR 21-NOV-1990; US-617239. 

PR 18-MAY-1992; US-885970. 

PA (CETU ) AGRACETUS INC. 

PI John M; 

DR WPI; 96-139095/14. 

PT New isolated fibre-specific promoters - used for introducing 

PT altered fibre-specific characteristics into plants, partic. cotton. 

PS Example 3; Column 45-46; 48pp; English. 

CC Cotton cDNA clone E9 (T13048) was isolated from a cDNA library of 

CC cotton var. Coker 312 15-day-old boll cells using a subtractive 

CC hybridization procedure. The clone hybridises strongly to fiber 

CC RNA and weakly to petal RNA. E9 and other fibre-specific cDNA clones 

CC (see T13033-47 and T13049-T13050) were used to screen cotton genomic 

CC libraries, leading to the isolation of genomic clones (see T13025-32 

CC and T13052-53) contg. sequences capable of promoting gene expression 

CC in fibre cells. 

SQ Sequence 1283 BP; 509 A; 233 C; 251 G; 290 T; 



Query Match 5.0%; Score 273.4; DB 1; Length 1283; 

Best Local Similarity 84.2%; Pred. No. 2.4e-28; 

Matches 326; Conservative 0; Mismatches 46; Indels 15; Gaps 1; 



Qy 4124 aatacacgttcttttctttctatttgattaaccatggctcatagcattcgtcaccctttc 4183 

i n i mm miinni i imminim i n in nun 

Db 13 ACTAAAAATTCTTTGCTTTCTATTTTGTAAACCATGGCTCATAACTTTTGTCATCCTTTC 72 
Qy 4184 ttccttttccaacttttactcataagtgtctcactagtgaccggtagccacactgtttcg 4243 

immiiiiiiiiniimi i iimiini i i mm in n in 

Db 73 TTCCTTTTCCAACTTTTACTCATTACTGTCTCACTAATAATCGGTAGTCACACCGTCTCG 132 
Qy 4244 gcagcggctcgacgtttattcgagacacaagcaacctcatcagagctcccacaattggct 4303 

minium nnn nun iiiimimiini niiiiniin 

Db 133 TCAGCGGCTCGACATTTATTCCAGACACAAACAACCTCATCAGAGCTGCCACAATTGGCT 192 
Qy 4304 tcaaaatacgaaaagcacgaagagtctgaatacgaaaagccagaatacaaacagccaaag 4363 

iiiiimiiiimiii nnn miniiin inn 

Db 193 TCAAAATACGAAAAGCACAAAGAGTCT---------------GAATACAAACAACCAAAA 237 

Qy 4364 tatcacgaagagtactcaaaacttgagaagcctgaaatgcaaaaggaggaaaaacaaaaa 4423 

mini nm mm mmim nnn iinninninnn 

Db 238 TATCACGAAAAGTACCCAAAACATGAGAAGCCTAAAATGCACAAGGAGGAAAAACAAAAA 297 
Qy 4424 ccctgcaaacagcatgaagagtaccacgagtcacacgaatcaaaggagcaaaaagagtac 4483 

minimi iimiiimiiimmn nnn mini nnn 

Db 298 CCCTGCAAACATCATGAAGAGTACCACGAGTCACGCGAATCGAAGGAGCACGAAGAGTAC 357 
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Db 121 TTTTATTTTGGTTTTGGGTTTTGTTGAGTTTTTTAGATMTTATTTTAAATATTCTGCAT 180 

Qy 1918 aatttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaaga 1977 

Db 181 AATITTTCTGTTATTTGAAAAGGATGTTCGAATTTTTTTTCAAAATTGAAACGITTAAGA 240 

Qy 1978 atttttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataagtt 2037 

Db 241 ATTTTTACTACTGCAAATTCAGAATAAGTGAATTTGTTTTTTAGAAAGATTAAATAAGTT 300 

Qy 2038 agtattacgatttttagtttgatttggtggaaagtaatgtatgtttttgaacataattat 2097 

Db 301 AGTATTACGATTTTTAGTTTGATTTGGTGGAAAGIAATGTATGTTTTTGAACATAATTAT 360 

Qy 2098 ttgacaataattaagttttctagggaataaacggaaatatcttc-ttcttttttgtaaaa 2156 

Qy 2157 ttactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcag 2216 

Db 421 TTACTAATGCAAGAACAAACAACGTTTTGGGAAGCAAATAATCTAGCTTTAAGTAGTCAG 480 

Qy 2217 tgtaactctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtagtaag 2276 

Db 481 TGTAACTCTCAAAATCTGGTCATAACTTCTAGGCTGAGTTTGCTGTGCTACAGTAGTAAG 540 

Qy 2277 tctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttttcct 2336 

Db 541 TCTATAGAAACTTACCTGACAAAACGACATGACGTCAGGGTCGAATCTACAACTITTCCT 600 

Qy 2337 ttttcttcaattaacatatggttgattcaagttccgatctataataatttattacgattt 2396 

Db 601 TTTTCTTCAATTAACATATGGTTGATTCAAGTTCCGATCTATAATAATTTATTACGATTT 660 

Qy 2397 atcaatttcaattaccttatatcatcctattataaatataagtcagttcaattcagtttt 2456 

Db 661 ATCAATTTCAATTACCTTATATCATCCTATTATAAATATAAGTCAGTTCAATTCAGTTTT 720 

Qy 2457 cgaaagttcccaaaaattttgaattttattaaatttattccctaaaaccgaaatagttat 2516 

minimi iiiiiiiiiiiiiimimiiiiimimiiimiiim n 

Db 721 CGAAAGTTCCCTAAAATTTTGAATTTTATTAAATTTATTCCCTAAAACCGAAATAGTGAT 780 

Qy 2517 atctttcaaatttaagtttcatttttcaatccgatttcaatttcatccttttataactct 2576 

Db 781 ATCTTTCAAATTTAAGTTTCATTTTTCAATCCGATTTCAATTTCATCCTTTTATAACTCT 840 

Qy 2577 ctattatctataattacataaatttcaaattaattttgaaatatttacactttagtccct 2636 

Db 841 CTATGATCTATAATTACATAAATTTCAAACTAATTTTGAAATATATACACTTTAGTCCCT 900 

Qy 2637 aagttcaaaactataaattttcactttagaaattaatcatttttcacatctaagcatcaa 2696 

Db 901 AAGTTCAAAACTATAAATTTTCACTTTAGAAATTAATCATTTTTCACATCTAAGCATCAA 960 

Qy 2697 atttaaccaaatgacacaaatttcatgattagttagatcaagcttttgagtcttcaaaac 2756 

Db 961 ATTTAACCAAATGACACAAATTTCATGATTAGTTAGATCAAGCTTTTGAGTCTTCAAAAA 1020 

Qy 2757 ataaaaatt----acaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaa 2812 

mi i i iiiiiiiiiiiiiiiiiiiiiiimiimiiiiiiimiiii 

Db 1021 CATAAAAATTACAAAAAAAAAAAAACAMCTTAAAATCATTTATCAATTTGAACAACAAA 1080 

Qy 2813 gcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacgg 2872 

Qy 2873 tggagagaagagggaaatgaagattgaccatatttttttattatgttttaacatataata 2932 

Db 1141 TGGAGAGAAGAGGGAAATGAAGATTGACCATATTTTTTTATTATGTTTTAACATATAATA 1200 

Qy 2933 ttaataatttaatcataattatactttggtgaatgtgacagtggggagatacgtaaagta 2992 

Db 1201 TTAATAATTTAATCATAATTATACTTTGGTGAATGTGACAGIGGGGAGATACGTAAAGTA 1260 



Qy 2993 ttttaacattatactttttgcaagcagttggctggtctacccaagagtgatcaaagtttg 3052 
Db 1261 -TATAACATTATACTTTTTGCAAGCAGTTGGCTGGTCTATCCAAGAGTGATCAAAGTTTG 1319 

Qy 3053 agctgccttcaatgagccaatttttgcccataatggataaaggcaatttgtttagttcaa 3112 
Db 1320 AGCTGCCTTCAATGAGCCAATTTTTGCCCATAATGGATAAAGGCAATTTGTTTAGTTCAA 1379 

Qy 3113 ctgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacac--- 3169 
Db 1380 CTGCTCACAGAATAATGTTAAAATGAAATTAAAATAAGGTGGCCTGGTCACACACACACA 1439 

Qy 3170 aaaaaaaaactaatgttggttggttgaattttatattacggaatgtaatattatatttta 3229 
Db 1440 AAAAAAAAACTAATGTTGGTTGGTTGAATTTTATATTACGGAATGTAATGTTATATTTTA 1499 

Qy 3230 aaataaaattatgttatttagattcttaatattttggagcattccatactataatttcgt 3289 
Db 1500 AAATAAAATTATGTTATTTAGATTCTTAATATTTT-GAGCATTCCATACTATAATCTCGT 1558 

Qy 3290 a-acataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataat 3348 
Db 1559 ATACATAATATTAAAATATAGTAATATAAAGTGTAATTAACTTTAAATTACAAGCATAAT 1618 

Qy 3349 attaaattttgaatcaattaatttttatttctattattttaattaatttagtctattttt 3408 
Db 1619 ATTAAATTTTGAATCAATTAATTTTTATTTCTATTATTTTAATTAATTTAGTCTATTTTT 1678 

Qy 3409 tcaaaataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatg 3468 
Db 1679 TCAAAATAAAATTTAAATCTAAATAAAAATAATTTTTCCTTAATATT 1725 

Qy 3469 ttatacttcaaaattataagtattatatttaccttgatgatttatttattagtatattaa 3528 
Db 1726 ATTAATAAATTTATTTCAACATCATATATTTACTTATTAATACATAAA 1773 

Qy 3529 ttctgattataattatggtgggatacaatcgctttccactaaatattttaactatgattt 3588 
II I 
Db 1774 TTAT 1777 

Qy 3589 ataaatttatttcaacatcgtatatttacttattaatacataatttatcataattttatg 3648 
Db 1778 AATAATTTATCATAATTTTATG 1799 

Qy 3649 gaaattgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaa 3708 
Db 1800 GAAATTGAGACCAAGAAACATTAAGAGAACAAATTCTATAACAAAGACAATTTAG-TAAA 1858 

Qy 3709 aatgtacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatga 3768 
Db 1859 AATGTACTTTTAGGTAATTTTAAGTACTCTTAACCAAACACAAAAATTCAAATCAAATGA 1918 

Qy 3769 actaaataagataatataacatacggaacatcttacttgtaatcttacattcccataatt 3828 
II 
Db 1919 ACCAAATAAGATAATATAACATACAGAATATCCTACTTGTATTCTTACATTCCCGTAATC 1978 

Qy 3829 ttattatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaa 3888 
Db 1979 ATATTATGAAAAGTAATATTATATTACCTGAGCCAAATGCTCTCACAAACTATTATCCAA 2038 

Qy 3889 ataaagaa-aaacacttaatttttataacattttttcatatatttgaaagattatattt 3946 
Db 2039 AAAAAAAATGTTGAAIATAATTTTTATAACATTTTTTCATATATTTGCAAGATTATATTT 2098 

Qy 3947 tgtatatttacgtaaaaatatttgacatagattgagcaccttcttaacataatcccacca 4006 
Db 2099 TGTATATTTACGTAAAAATATTTGACATAGATTGAACACCTTCTTAACATAATCCCACCA 2158 

Qy 4007 taagtcaagtatgtagatgagaaattggtacaaacaacgtggggccaaatcccaccaaac 4066 
Db 2159 TAAGTCAAGTATGTAGATGAGAAATTGGTACAAACAACGTGGGGCCAAATCCCACCAAAC 2218 
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Db 


2041 


AAGTTAGTATTACGRTTTTTAGTTTGATTTGGTGGAAAGTAATGTATGTTTTTGAACATA 


2100 


Qy 2093 attatttgacaataattaagttttctagggaataaacggaaatatcttcttcttttttgt 2152 
Db 2101 ATTATTTGACAATAATTAAGTTTTCTAGGGAATAAACGGAAATATCTTCTTCTTTTTTGT 2160 

Qy 2153 aaaattactaatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtag 2212 
Db 2161 AAAATTACTAATGCAAGAACAAACAACGTTTTGGGGAGCAAATAATCTAGCTTTAAGTAG 2220 

Qy 2213 tcagtgtaactctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtag 2272 
Db 2221 TCAGTGTAACTCTCAAAATCTGGTCATAACTTCTAGGCTGAGTTTGCTGTGCTACAGTAG 2280 

Qy 2273 taagtctatagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttt 2332 
Db 2281 TAAGTCTATAGAAACTTACCTGACAAAACGACATGACGTCAGGGTCGAATCTACAACTTT 2340 

Qy 2333 tcctttttcttcaattaacatatggttgattcaagttccgatctataataatttattacg 2392 
Db 2341 TCCTTTTTCTTCAATTAACATATGGTTGATTCAAGTTCCGATCTATAATAATTTATTACG 2400 

Qy 2393 atttatcaatttcaattaccttatatcatcctattataaatataagtcagttcaattcag 2452 
Db 2401 ATTTATCAATTTCAATTACCTTATATCATCCTATTATAAATATAAGTCAGTTCAATTCAG 2460 

Qy 2453 ttttcgaaagttcccaaaaattttgaattttattaaatttattccctaaaaccgaaatag 2512 
Db 2461 TTTTCGAAAGTTCCCAAAAATTTTGAATTTTATTAAATTTATTCCCTAAAACCGAAATAG 2520 

Qy 2513 ttatatctttcaaatttaagtttcatttttcaatccgatttcaatttcatccttttataa 2572 
Db 2521 TTATATCTTTCAAATTTAAGTTTCATTTTTCAATCCGATTTCAATTTCATCCTTTTATAA 2580 

Qy 2573 ctctctattatctataattacataaatttcaaattaattttgaaatatttacactttagt 2632 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 CTCTCTATTATCTATAATTACATAAATTTCAAATTAATTTTGAAATATTTACACTTTAGT 2640 

Qy 2633 ccctaagttcaaaactataaattttcactttagaaattaatcatttttcacatctaagca 2692 
Db 2641 CCCTAAGTTCAAAACTATAAATTTTCACTTTAGAAATTAATCATTTTTCACATCTAAGCA 2700 

Qy 2693 tcaaatttaaccaaatgacacaaatttcatgattagttagatcaagcttttgagtcttca 2752 
Db 2701 TCAAATTTAACCAAATGACACAAATTTCATGATTAGTTAGATCAAGCTTTTGAGTCTTCA 2760 

Qy 2753 aaacataaaaattacaaaaaaaaaacaaacttaaaatcatttatcaatttgaacaacaaa 2812 
Db 2761 AAACATAAAAATTACAAAAAAAAAACAAACTTAAAATCATTTATCAATTTGAACAACAAA 2820 

Qy 2813 gcttggccgaatgctaagagcttaaaaatggcttcttttgtttctttttgttgcaaacgg 2872 
Db 2821 GCTTGGCCGAATGCTAAGAGCTTAAAAATGGCTTCTTTTGTTTCTTTTTGTTGCAAACGG 2880 

Qy 2873 tggagagaagagggaaatgaagattgaccatatttttttattatgttttaacatataata 2932 
Db 2881 TGGAGAGAAGAGGGAAATGAAGATTGACCATATTTTTTTATTATGTTTTAACATATAATA 2940 

Qy 2933 ttaataatttaatcataattatactttggtgaatgtgacagtggggagatacgtaaagta 2992 
Db 2941 TTAATAATTTAATCATAATTATACTTTGGTGAATGTGACAGTGGGGAGATACGTAAAGTA 3000 

Qy 2993 ttttaacattatactttttgcaagcagttggctggtctacccaagagtgatcaaagtttg 3052 
Db 3001 TTTTAACATTATACTTTTTGCAAGCAGTTGGCTGGTCTACCCAAGAGTGATCAAAGTTTG 3060 

Qy 3053 agctgccttcaatgagccaatttttgcccataatggataaaggcaatttgtttagttcaa 3112 
Db 3061 AGCTGCCTTCAATGAGCCAATTTTTGCCCATAATGGATAAAGGCAATTTGTTTAGTTCAA 3120 

Qy 3113 ctgctcacagaataatgttaaaatgaaattaaaataaggtggcctggtcacacacacaaa 3172 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 CTGCTCACAGAATAATGTTAAAATGAAATTAAAATAAGGTGGCCTGGTCACACACACAAA 3180 

Qy 3173 aaaaaactaatgttggttggttgaattttatattacggaatgtaatattatattttaaaa 
Db 3181 AAAAAACTAATGTTGGTTGGTTGAATTTTATATTACGGAATGTAATATTATATTTTAAAA 3240 

Qy 3233 taaaattatgttatttagattcttaatattttggagcattccatactataatttcgtaac 
Db 3241 TAAAATTATGTTATTTAGATTCTTAATATTTTGGAGCATTCCATACTATAATTTCGTAAC 3300 

Qy 3293 ataatattaaaatatagtaatataaagtgtaattaactttaaattacaagcataatatta 
Db 3301 ATAATATTAAAATATAGTAATATAAAGTGTAATTAACTTTAAATTACAAGCATAATATTA 3360 

Qy 3353 aattttgaatcaattaatttttatttctattattttaattaatttagtctattttttcaa 3412 
Db 3361 AATTTTGAATCAATTAATTTTTATTTCTATTATTTTAATTAATTTAGTCTATTTTTTCAA 3420 

Qy 3413 aataaaatttaaatctaaataaaaataatttttccttaatgttgaaacaactcatgttat 
Db 3421 AATAAAATTTAAATCTAAATAAAAATAATTTTTCCTTAATGTTGAAACAACTCATGTTAT 3480 

Qy 3473 acttcaaaattataagtattatatttaccttgatgatttatttattagtatattaattct 3532 
Db 3481 ACTTCAAAATTATAAGTATTATATTTACCTTGATGATTTATTTATTAGTATATTAATTCT 3540 

Qy 3533 gattataattatggtgggatacaatcgctttccactaaatattttaactatgatttataa 
Db 3541 GATTATAATTATGGTGGGATACAATCGCTTTCCACTAAATATTTTAACTATGATTTATAA 3600 

Qy 3593 atttatttcaacatcgtatatttacttattaatacataatttatcataattttatggaaa 3652 
Db 3601 ATTTATTTCAACATCGTATATTTACTTATTAATACATAATTTATCATAATTTTATGGAAA 3660 

Qy 3653 ttgagaccaagaaacattaagagaacaaattctataacaaagacaatttagaaaaaaatg 3712 
Db 3661 TTGAGACCAAGAAACATTAAGAGAACAAATTCTATAACAAAGACAATTTAGAAAAAAATG 3720 

Qy 3713 tacttttaggtaattttaagtactcttaaccaaacacaaaaattcaaatcaaatgaacta 
Db 3721 TACTTTTAGGTAATTTTAAGTACTCTTAACCAAACACAAAAATTCAAATCAAATGAACTA 3780 

Qy 3773 aataagataatataacatacggaacatcttacttgtaatcttacattcccataattttat 
Db 3781 AATAAGATAATATAACATACGGAACATCTTACTTGTAATCTTACATTCCCATAATTTTAT 3840 

Qy 3833 tatgaaaaataatcttatattactcgaactaaatgttgtcacaaattattatctaaataa 
Db 3841 TATGAAAAATAATCTTATATTACTCGAACTAAATGTTGTCACAAATTATTATCTAAATAA 3900 

Qy 3893 agaaaaacacttaatttttataacattttttcatatatttgaaagattatattttgtata 3952 
Db 3901 AGAAAAACACTTAATTTTTATAACATTTTTTCATATATTTGAAAGATTATATTTTGTATA 3960 

Qy 3953 tttacgtaaaaatatttgacatagattgagcaccttcttaacataatcccaccataagtc 
Db 3961 TTTACGTAAAAATATTTGACATAGATTGAGCACCTTCTTAACATAATCCCACCATAAGTC 4020 

Qy 4013 aagtatgtagatgagaaattggtacaaacaacgtggggccaaatcccaccaaaccatctc 
Db 4021 AAGTATGTAGATGAGAAATTGGTACAAACAACGTGGGGCCAAATCCCACCAAACCATCTC 4080 

Qy 4073 tcattctctcctataaaaggcttgctacacatagacaacaatccacacacaaatacacgt 4132 
Db 4081 TCATTCTCTCCTATAAAAGGCTTGCTACACATAGACAACAATCCACACACAAATACACGT 4140 

Qy 4133 tcttttctttctatttgattaaccatggctcatagcattcgtcaccctttcttccttttc 4192 
Db 4141 TCTTTTCTTTCTATTTGATTAACCATGGCTCATAGCATTCGTCACCCTTTCTTCCTTTTC 4200 

Qy 4193 caacttttactcataagtgtctcactagtgaccggtagccacactgtttcggcagcggct 4252 
Db 4201 CAACTTTTACTCATAAGTGTCTCACTAGTGACCGGTAGCCACACTGTTTCGGCAGCGGCT 4260 
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Qy 


4561 


Db 4561 

Qy 4621 
Db 4621 

Qy 4681 
Db 4681 

Qy 4741 
Db 4741 

Qy 4801 
Db 4801 

Qy 4861 
Db 4861 

Qy 4921 
Db 4921 

Qy 4981 
Db 4981 

Qy 5041 
Db 5041 

Qy 5101 
Db 5101 

Qy 5161 
Db 5161 

Qy 5221 
Db 5221 

Qy 5281 
Db 5281 

Qy 
Db 5341 

Qy 5401 
Db 5401 

Qy 5461 
Db 5461 



IIIIIIMIIIIIIIIIIIMIMIIMIIIIIIIMIIIIIIIIIIIIIIIIIIIIIII 



i;iM!!lllllll!ll!l!ll!IIIMII![llllllll!l!ll!ll!ll!lll!lll! 



Illlllllllllllllllllllllllllllllllllllllllllimilimilllll 



IIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIMI 



ATGTTATGTATTTTACTTTAATGATATTGCATGTATTGTTAATTTAACATTGCTTGATCA 4 



lllll!!l!IIIIIM!llll!l!llllll!lllllllll!lllllll!!lll!!!llll 



iiiiiiimiiiiimiiiiiiimiimimimiiiiiiiiiiimiiiii 



IIIIIIIIIMIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMI 
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MIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIMIIIIIIIIIII 
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RESULT 2 
T73865 

ID T73865 standard; DNA; 5547 BP. 
AC T73865; 

DT 26-JAN-1998 (first entry) 

DE Cotton fibre promoter clone 4-4(6) construct, pCGN5606 (Version I). 
KW promoter; fibre-specific; transcriptional factor; promoter; 
KW altered phenotype; colour; melanin; indigo; ss. 

OS Gossypium hirsutum cv. Coker 130. 

FH Key Location/Qualifiers 

FT miscjeature 1, ,65 

FT /*tag- a 

FT /note- "fragment of pBluescriptll polylinker (as 

FT stated in the specification)" 

FT misc_feature 57..5494

FT /*tag= b

FT /note= "genomic clone 4-4(6) from lambda phage clone of

FT a cotton Coker 130 genomic library (as stated in

FT the specification)"

FT miscJNA 65..4163

FT /*tag= c

FT /note= "5' flanking region of the 4-4(6) gene (as

FT stated in the specification)"

FT CDS 4163..4502

FT /*tag= d

FT /note= "corresponds to part of the 4-4(6) ORF (as

FT stated in the specification)"

FT CDS complement(4131..4502)

FT /*tag= i

FT /transl_except= (pos:4170..4172, aa:Xaa)

FT /transl_except= (pos:4182..4184, aa:Xaa)

FT /note= "Xaa = stop codon; No start or stop codons

FT given, possibly conforms to exon structure.

FT Encodes W21899"

FT misc_feature 4502..4555

FT /*tag= e

FT /note= "synthetic polylinker oligonucleotide containing

FT unique target sites for EcoRI, SmaI, SalI, NheI

FT and BglII"

FT misc_feature 4163..4555

FT /*tag= f

FT /note= "stuffer fragment left in place to facilitate the

FT monitoring of cloning manipulations (as stated in

FT the specification)"

FT 3'UTR 4555..5494

FT /*tag= g

FT /note= "corresponds to the 940 nucleotides downstream of

FT the stop codon and constitutes the 3' flanking

FT region of the 4-4(6) gene (as stated in the

FT specification)"

FT misc_feature 5494..5547

FT /*tag= h

FT /note= "fragment of pBluescriptII polylinker (as stated

FT in the specification)"

PN WO9640924-A2. 

PD 19-DEC-1996.

PF 07-JUN-1996; 009897. 

PR 07-JUN-1995; DS-480178. 

PR 01-JUL-1996; ZA-005572. 

PA (CALJ ) CALGENE INC.

PI Mcbride K, Pear JR, Perez-Grau L, Stalker DM; 

DR WPI; 97-052325/05. 

DR P-PSDB; W21899. 

PT DNA construct contg. gene of interest controlled by cotton fibre

PT transcriptional factor - used to produce altered phenotype cotton

PT fibre cells expressing genes affecting pigmentation 

PS Claim 22; Fig 2A-J; 95pp; English. 

CC The present sequence is a 4-4 cotton fibre expression cassette (version 

CC I) from promoter construct pCGN5606. The lambda genomic phage clone used

CC to form this construct was designated 4-4(6). DNA constructs containing 

CC cotton fibre-specific transcriptional factor promoters are useful to 

CC produce cotton fibre cells with altered phenotype, especially altered 

CC colour. Genes involved in the production of melanin (e.g. tyrosinase

CC gene and ORF438 encoded protein from Streptomyces antibioticus) and 

CC indigo (mono-oxygenase genes possibly in conjunction with a 

CC tryptophanase gene) are of interest. The promoters of the invention are

CC reliable and permit expression of a protein selectively in cotton fibre 

CC to affect qualities such as fibre strength, length, colour and dyability 

CC as required. The construct and methods can also be used for the 

CC introduction of other advantageous genes into a cotton plant, e.g. a 
CC plant hormone. In particular, fibres from a plant producing coloured
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Qy 181 ctgatttacatcctttatataggctgaaactacaacaactttagctaaaaaaataggata 240 

Db 181 CTGATTTACATCCTTTATATAGGCTGAAACTACAACAACTTTAGCTAAAAAAATAGGATA 240 

Qy 241 acctaatagcaaaatcacaatcagatattaaaccatgattttagctaaccatttaacaac 300 

Db 241 ACCTAATAGCAAAATCACAATCAGATATTAAACCATGATTTTAGCTAACCATTTAACAAC 300 

Qy 301 tttattgaaactaatttgaatatttcatctgctgatatgcccaagattttaggccactaa 360 

Db 301 TTTATTGAAACTAATTTGAATATTTCATCTGCTGATATGCCCAAGATTTTAGGCCACTAA 360 

Qy 361 ccgatttggtggtgaactttaacatgtcatgcatttgtaactgtttgaaacaagtttttt 420 

Db 361 CCGATTTGGTGGTGAACTTTAACATGTCATGCATTTGTAACTGTTTGAAACAAGTTTTTT 420 

Qy 421 gcattattttactatatgaactgtttgattaggttgagttacacactgagcttgtaagct 480 

Db 421 GCATTATTTTACTATATGAACTGTTTGATTAGGTTGAGTTACACACTGAGCTTGTAAGCT 480 

Qy 481 cactcaaatttttctaatttctaaggtgatcagcaaacttaggaccgggcggcgtacgag 540 

Db 481 CACTCAAATTTTTCTAATTTCTAAGGTGATCAGCAAACTTAGGACCGGGCGGCGTACGAG 540 

Qy 541 agctcggattgattttctagttaataaataagacgatttatgtttttaaactattatgga 600 

Db 541 AGCTCGGATTGATTTTCTAGTTAATAAATAAGACGATTTATGTTTTTAAACTATTATGGA 600 

Qy 601 ctttttggactatgtaactgtttgggactttatttttgttttttatttgctttttttgga 660 

Db 601 CTTTTTGGACTATGTAACTGTTTGGGACTTTATTTTTGTTTTTTATTTGCTTTTTTTGGA 660 

Qy 661 tttagtaattattatttttaaactgcaaaattatatgtttttacaaactaagtcacagtt 720 

Db 661 TTTAGTAATTATTATTTTTAAACTGCAAAATTATATGTTTTTACAAACTAAGTCACAGTT 720 

Qy 721 ttcaaaattccataacttagaatttttcgctgcaaaataaagtaatcatttaagtgtttt 780 

Db 721 TTCAAAATTCCATAACTTAGAATTTTTCGCTGCAAAATAAAGTAATCATTTAAGTGTTTT 780 

Qy 781 ttctgtaataaaataaataaataattttaacgagtattttcctaaaaattggaaattgat 840 

Db 781 TTCTGTAATAAAATAAATAAATAATTTTAACGAGTATTTTCCTAAAAATTGGAAATTGAT 840 

Qy 841 ttaccaaaattagtatgtcaaaacacatgtttatatgttacagggcgatatcgtctaggc 900 

Db 841 TTACCAAAATTAGTATGTCAAAACACATGTTTATATGTTACAGGGCGATATCGTCTAGGC 900 

Qy 901 aaataacatctaggcggggtttggagtgttacagggcgagtgggctcattttgagtaagt 960 

Db 901 AAATAACATCTAGGCGGGGTTTGGAGTGTTACAGGGCGAGTGGGCTCATTTTGAGTAAGT 960

Qy 961 atagttagggccgagttttagattgcatattcaaggtcaaagattttgtaaacttcgatg 1020 

Db 961 ATAGTTAGGGCCGAGTTTTAGATTGCATATTCAAGGTCAAAGATTTTGTAAACTTCGATG 1020 

Qy 1021 aatgatatgtatgattgtccgattaacgaaatatgtttttttcttttgtgtgtgttttat 1080 

Db 1021 AATGATATGTATGATTGTCCGATTAACGAAATATGTTTTTTTCTTTTGTGTGTGTTTTAT 1080

Qy 1081 ctcgtgtgataagtatatagtatgttttattccaattcttatggcatgtgacattgtggc 1140 

Db 1081 CTCGTGTGATAAGTATATAGTATGTTTTATTCCAATTCTTATGGCATGTGACATTGTGGC 1140 

Qy 1141 tattctaattaaattgatttgttattattgaaatctgatgcatctgttctacaaagcatg 1200 

Db 1141 TATTCTAATTAAATTGATTTGTTATTATTGAAATCTGATGCATCTGTTCTACAAAGCATG 1200 

Qy 1201 gaatctcatgcctactgctttctgttaaagatacgattgcaagtttaacatgcttactat 1260 

Db 1201 GAATCTCATGCCTACTGCTTTCTGTTAAAGATACGATTGCAAGTTTAACATGCTTACTAT 1260 

Qy 1261 tttgattttgtccttgcatgctatgtcacattacatggggttgggatgatatggtaagga 1320



Db 1261 TTTGATTTTGTCCTTGCATGCTATGTCACATTACATGGGGTTGGGATGATATGGTAAGGA 1320 

Qy 1321 ggaagttttgacagtttaatgatttgcactatctggtggtttaaccacatatttgttatg 1380 

Db 1321 GGAAGTTTTGACAGTTTAATGATTTGCACTATCTGGTGGTTTAACCACATATTTGTTATG 1380 

Qy 1381 gcatcttgactgcggttatggtggctcgaccgcccatatctgttctggaaatttatctgt 1440 

Db 1381 GCATCTTGACTGCGGTTATGGTGGCTCGACCGCCCATATCTGTTCTGGAAATTTATCTGT 1440 

Qy 1441 gactctggtggcattgtctacaattatttgttggtgtgttttggatggacgagtcgtggg 1500 

Db 1441 GACTCTGGTGGCATTGTCTACAATTATTTGTTGGTGTGTTTTGGATGGACGAGTCGTGGG 1500 

Qy 1501 gaactctatttggtgtgttgcggagttgggtaggaaattttcgaaaaaaatttgcattgt 1560 

Db 1501 GAACTCTATTTGGTGTGTTGCGGAGTTGGGTAGGAAATTTTCGAAAAAAATTTGCATTGT 1560

Qy 1561 gtttttctgaaaaatattgcattaacataatcatgcattctcaattttggtcaattgaac 1620 

Db 1561 GTTTTTCTGAAAAATATTGCATTAACATAATCATGCATTCTCAATTTTGGTCAATTGAAC 1620 

Qy 1621 gttataaaattctctatgatatcctgatctgtttattacattatatgtgtttatgcttga 1680

Db 1621 GTTATAAAATTCTCTATGATATCCTGATCTGTTTATTACATTATATGTGTTTATGCTTGA 1680 

Qy 1681 gttaagtcaaacattgagattcatagctcacccaattatttaatcatttcaggcaatctg 1740 

Db 1681 GTTAAGTCAAACATTGAGATTCATAGCTCACCCAATTATTTAATCATTTCAGGCAATCTG 1740

Qy 1741 cagacttaggattggatggcgttcaggagcttggattggttttctcacatcatattttat 1800 

Db 1741 CAGACTTAGGATTGGATGGCGTTCAGGAGCTTGGATTGGTTTTCTCACATCATATTTTAT 1800

Qy 1801 taaataattattaattaaaatttatggacttttggactgtctgactaattttcagaattt 1860 

Db 1801 TAAATAATTATTAATTAAAATTTATGGACTTTTGGACTGTCTGACTAATTTTCAGAATTT 1860

Qy 1861 tattttggttttgggttttgttgaattttttagataattattttaaatattctgcataat 1920 

Qy 1921 ttttctgttatttgaaaaggatgttcgaattttttttcaaaattgaaacgtttaagaatt 1980 

Db 1921 TTTTCTGTTATTTGAAAAGGATGTTCGAATTTTTTTTCAAAATTGAAACGTTTAAGAATT 1980 

Qy 1981 tttactactgcaaattcagaataagtgaatttgttttttagaaagattaaataagttagt 2040 

Db 1981 TTTACTACTGCAAATTCAGAATAAGTGAATTTGTTTTTTAGAAAGATTAAATAAGTTAGT 2040 

Qy 2041 attacgatttttagtttgatttggtggaaagtaatgtatgtttttgaacataattatttg 2100 

Db 2041 ATTACGATTTTTAGTTTGATTTGGTGGAAAGTAATGTATGTTTTTGAACATAATTATTTG 2100 

Qy 2101 acaataattaagttttctagggaataaacggaaatatcttcttcttttttgtaaaattac 2160 

Db 2101 ACAATAATTAAGTTTTCTAGGGAATAAACGGAAATATCTTCTTCTTTTTTGTAAAATTAC 2160

Qy 2161 taatgcaagaacaaacaacgttttggggagcaaataatctagctttaagtagtcagtgta 2220 

Db 2161 TAATGCAAGAACAAACAACGTTTTGGGGAGCAAATAATCTAGCTTTAAGTAGTCAGTGTA 2220 

Qy 2221 actctcaaaatctggtcataacttctaggctgagtttgctgtgctacagtagtaagtcta 2280 

Db 2221 ACTCTCAAAATCTGGTCATAACTTCTAGGCTGAGTTTGCTGTGCTACAGTAGTAAGTCTA 2280

Qy 2281 tagaaacttacctgacaaaacgacatgacgtcagggtcgaatctacaacttttccttttt 2340 

Db 2281 TAGAAACTTACCTGACAAAACGACATGACGTCAGGGTCGAATCTACAACTTTTCCTTTTT 2340

Qy 2341 cttcaattaacatatggttgattcaagttccgatctataataatttattacgatttatca 2400 



